Evaluation by Alignment

A Framework for Robust End-to-End NLP Assessment

Jungyeul Park (Author)

Book | Hardcover

V, 120 pages

2026
Springer International Publishing (Publisher)
978-3-032-16562-6 (ISBN)

Not yet published; scheduled for release on 11 March 2026

This book presents a novel, alignment-based evaluation framework that tackles a persistent challenge in natural language processing (NLP): how to evaluate systems fairly and accurately when preprocessing steps such as tokenization and sentence boundary detection (SBD) leave gold-standard and system outputs misaligned. It introduces the jointly preprocessed evaluation algorithm (jp-algorithm), which brings precision and flexibility to the assessment of modern, end-to-end NLP systems. Traditional evaluation methods assume identical sentence and token boundaries between references and hypotheses, making them poorly suited to real-world data and to increasingly common end-to-end architectures. The jp-algorithm addresses these shortcomings with a linear-time alignment strategy inspired by techniques in machine translation. This method allows for robust comparisons even when input segmentation differs, enabling reliable evaluation in tasks such as preprocessing, constituency parsing, and grammatical error correction (GEC). The book explores how misaligned preprocessing affects standard evaluation metrics, including PARSEVAL for constituency parsing and F0.5 for GEC, and provides empirical solutions for preserving evaluation accuracy without sacrificing methodological integrity. Through detailed case studies, formal algorithmic descriptions, and practical implementations, it equips researchers, tool developers, and instructors with a generalizable framework for improving NLP evaluation practices. The book is intended for researchers, graduate students, and professionals working in NLP, corpus linguistics, and computational linguistics.
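To make the tokenization-mismatch problem concrete, the following is a minimal illustrative sketch and not the book's jp-algorithm, whose details are not given in this description. It assumes that the gold and system tokenizations spell out exactly the same characters once whitespace is ignored and that no token is empty. The function name `align_tokens` and the example sentences are hypothetical. A single two-pointer pass groups tokens into the smallest chunks whose boundaries coincide, so the running time is linear in the number of tokens and characters.

```python
from typing import List, Tuple

Chunk = Tuple[List[str], List[str]]


def align_tokens(gold: List[str], system: List[str]) -> List[Chunk]:
    """Group two tokenizations of the same text into minimal aligned chunks.

    Assumption (illustrative only): both sequences contain non-empty,
    whitespace-free tokens that concatenate to the same character string.
    """
    if "".join(gold) != "".join(system):
        raise ValueError("token sequences do not cover the same characters")

    chunks: List[Chunk] = []
    i = j = 0
    while i < len(gold) and j < len(system):
        # Start a new chunk with one token from each side.
        g_chunk, s_chunk = [gold[i]], [system[j]]
        g_len, s_len = len(gold[i]), len(system[j])
        i += 1
        j += 1
        # Extend the shorter side until both cover the same characters.
        while g_len != s_len:
            if g_len < s_len:
                g_chunk.append(gold[i])
                g_len += len(gold[i])
                i += 1
            else:
                s_chunk.append(system[j])
                s_len += len(system[j])
                j += 1
        chunks.append((g_chunk, s_chunk))
    return chunks


if __name__ == "__main__":
    gold_tokens = ["do", "n't", "stop", "."]
    system_tokens = ["don't", "stop."]
    for gold_part, system_part in align_tokens(gold_tokens, system_tokens):
        print(gold_part, "<->", system_part)
```

Once chunks like these are available, a token-level metric can be computed per aligned chunk rather than per token index, which is one simple way to avoid penalizing a system solely because its token boundaries differ from the reference.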

Jungyeul Park, Ph.D., a Research Professor at the Culture Technology Institute, Korea Advanced Institute of Science and Technology (KAIST), is a computational linguist and computer scientist specializing in syntactic parsing, formal grammar, and evaluation methodologies in natural language processing (NLP). His work includes the development of algorithms for converting dependency structures to constituency structures across English, Chinese, and Korean, as well as the creation of multilingual treebanks and linguistically informed evaluation frameworks. He has published extensively on parsing, corpus development, and grammatical error correction, and recently introduced an alignment-based evaluation algorithm that addresses preprocessing mismatches, a critical issue in evaluating end-to-end NLP systems. His research blends theoretical precision with practical applications, particularly in multilingual and learner language contexts. Dr. Park collaborates widely on international projects and interdisciplinary initiatives that promote robust NLP tools and linguistic diversity. His work consistently aims to make language technologies more reliable and accessible across real-world settings.

Preface.- A Joint Preprocessing Framework for Evaluation.- Applications of JP-Algorithm.- Conclusion.

Publication date (per publisher)	11 March 2026
Series	Synthesis Lectures on Computer Science
Additional info	V, 120 p.
Place of publication	Cham
Language	English
Dimensions	168 x 240 mm
Subject area	Mathematics / Computer Science ► Computer Science
Keywords	Alignment-based Evaluation • Constituency Parsing Evaluation • End-to-end NLP Systems • Grammatical Error Correction (GEC) • JP-algorithm • Multilingual NLP Evaluation • Natural Language Processing (NLP) • NLP Evaluation • NLP Metrics • Preprocessing in NLP • Sentence Boundary Detection (SBD) • Tokenization Mismatches
ISBN-10	3-032-16562-8 / 3032165628
ISBN-13	978-3-032-16562-6 / 9783032165626
Condition	New