Knowledge-Driven Multilingual Text Analysis and Transparent Information Retrieval - Gregor Thurmair

Knowledge-Driven Multilingual Text Analysis and Transparent Information Retrieval

Language Technology for Industrial Applications

Gregor Thurmair (Author)

Book | Hardcover
XIV, 393 pages
2025
Springer International Publishing (Publisher)
978-3-031-91740-0 (ISBN)
CHF 249.95 incl. VAT
This book presents all components and knowledge sources required for Transparent Information Retrieval. Depending on the topic, both deep and shallow technologies are used, with attention to their interoperability. Processing starts with the analysis of the text data and collects the results in a multilingual conceptual network. This enables Transparent Information Retrieval, in which users communicate with the system in their native language while the documents may be in a different language, a difference that remains transparent to the users. To this end, the author investigates all text analysis components required for multilingual indexing, starting with preparatory work such as language and topic identification, continuing with sentence splitting and tokenization (including Chinese), and describing lexical analysis, including multiword entries and Named Entities. Entries are then disambiguated on both the syntactic level (by a tagger) and the semantic level (by multilingual word sense disambiguation). The analysis results are collected in a dynamic multilingual ConceptNet, an index structure extended by monolingual relations (such as synonyms or head-modifier links) as well as multilingual ones (translations). In addition to many European languages, Turkish, Arabic, Persian, and Chinese are also treated. The book concludes with a description of the components needed to build the required resources, such as crawlers, bilingual term extraction, and tools for defaulting linguistic annotations. For each component, readers will find a technology overview, a discussion of its main challenges in computational treatment, a description of the technical solution selected, and evaluation information.

Gregor Thurmair has long experience in multilingual text processing and machine translation in industrial settings. Starting with the first retrieval and dialogue systems in the 1980s, he has worked as a researcher, project leader, and technical director, both in the development of IR and MT systems (Siemens METAL, Linguatec's Personal Translator) and in Language Engineering projects for terminology, multilingual text analysis, and translation in several EU projects. He has more than 50 publications; he was a member of the ELRA board, a reviewer for the European Commission, and an invited speaker at several conferences (LREC, CLEF, MT Summit).

Preface.- 1. System Design.- 2. TINA Analysis Strategy.- 3. Text Analysis Preprocessing.- 4. Text Segmentation.- 5. Lexical Analysis.- 6. Special Entries.- 7. Disambiguation.- 8. Transparent Information Retrieval (TIR) and the LtConceptNet.- 9. Resources.

Publication date
Series Cognitive Technologies
Additional info XIV, 393 p. 174 illus., 170 illus. in color.
Place of publication Cham
Language English
Dimensions 155 x 235 mm
Subject area Computer Science Theory / Studies Artificial Intelligence / Robotics
Keywords conceptual network • Information Retrieval • Lexical Analysis • LtConceptNet • Multilingual Indexing • Named Entity Recognition • text analysis • Text Segmentation • TINA • Transparent Information Retrieval (TIR) • Word sense disambiguation
ISBN-10 3-031-91740-5 / 3031917405
ISBN-13 978-3-031-91740-0 / 9783031917400
Condition New