Comparable Corpora and Computer-assisted Translation - Estelle Maryline Delpech

Comparable Corpora and Computer-assisted Translation (eBook)

Estelle Maryline Delpech (Author)

eBook Download: PDF

2014 | 1st edition
304 pages
John Wiley & Sons (Publisher)
978-1-119-00252-9 (ISBN)

Computer-assisted translation (CAT) has always relied on translation memories, which require the translator to have a corpus of previous translations from which the CAT software can generate bilingual lexicons. This can be problematic when the translator has no such corpus, for instance when the text belongs to an emerging field. To address this issue, CAT research has looked into leveraging comparable corpora, i.e. sets of texts in two or more languages that deal with the same topic but are not translations of one another.

This work has two primary objectives. The first is to assess the contribution of lexicons extracted from comparable corpora in the context of a specialized human translation task. The second is to identify the bilingual-lexicon-extraction methods that best match translators' needs, determining the current limits of these techniques and suggesting improvements. The author focuses, in particular, on the identification of fertile translations, the management of multiple morphological structures, and the ranking of candidate translations.

The experiments are carried out on two language pairs (English-French and English-German) and on specialized texts dealing with breast cancer. This research puts significant emphasis on applicability: methodological choices are guided by the needs of the end users. The book is organized in two parts: the first presents the applicative and scientific context of the research, and the second is given over to efforts to improve compositional translation.

The research work presented in this book received the PhD Thesis award 2014 from the French association for natural language processing (ATALA).

Estelle Maryline Delpech holds a PhD in Computer Science from the University of Nantes in France, where she specialized in natural language processing and computer-aided translation. She is currently Chief Scientist at Nomao, a web and mobile app search engine company. Her research interests include multilingualism, computational linguistics, information extraction and data integration.

Acknowledgments

Introduction

Part 1 Applicative and Scientific Context

Chapter 1 Leveraging Comparable Corpora and Computer-Assisted Translation

Chapter 2 User-Centered Evaluation of Lexicons Extracted from Comparable Corpora

Chapter 3 Automatic Generation of Term Translations

Part 2 Contributions to Compositional Translation

Chapter 4 Morph-Compositional Translation: Methodological Framework

Chapter 5 Experimental Data

Chapter 6 Formalization and Evaluation of Candidate Translation Generation

Chapter 7 Formalization and Evaluation of Candidate Translation Ranking

Conclusion and Perspectives

Part 3 Appendices

Appendix 1 Measures

Appendix 2 Data

Appendix 3 Comparable Corpora Lexicons Consultation Interface

List of Tables

List of Figures

List of Algorithms

List of Extracts

Bibliography

Index

Publication date (per publisher)	22 July 2014
Language	English
Subject area	Humanities ► Linguistics / Literary Studies ► Linguistics
	Mathematics / Computer Science ► Computer Science ► Programming Languages / Tools
	Mathematics / Computer Science ► Computer Science ► Software Development
Keywords	Computer Science • Programming & Software Development
ISBN-10	1-119-00252-4 / 1119002524
ISBN-13	978-1-119-00252-9 / 9781119002529

PDF (Adobe DRM)

Copy protection: Adobe DRM
Adobe DRM is a copy-protection scheme intended to protect the eBook against misuse. The eBook is authorized to your personal Adobe ID at download time, and you can then read it only on devices that are also registered to that Adobe ID.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly well suited to specialist books with columns, tables, and figures. A PDF can be displayed on almost any device, but it is of limited use on small displays (smartphones, eReaders).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need an Adobe ID and the Adobe Digital Editions software (free of charge). We advise against using OverDrive Media Console; in our experience it frequently causes problems with Adobe DRM.
eReader: This eBook can be read on (almost) all eBook readers. It is not, however, compatible with the Amazon Kindle.
Smartphone/Tablet: You can read this eBook on both Apple and Android devices. You need an Adobe ID and a free app.

Buying eBooks from abroad
For tax law reasons, we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.