Ontology Learning and Population from Text (eBook)
XXVIII, 347 pages
Springer US (publisher)
9780387392523 (ISBN)
In the last decade, ontologies have received much attention within computer science and related disciplines, most often in the context of the semantic web. Ontology Learning and Population from Text: Algorithms, Evaluation and Applications discusses ontologies for the semantic web as well as for knowledge management, information retrieval, text clustering and classification, and natural language processing.
Ontology Learning and Population from Text: Algorithms, Evaluation and Applications is written for research scientists and practitioners in industry. This book is also suitable for graduate-level students in computer science.
Contents 7
List of Figures 11
List of Tables 13
Foreword 15
Preface 17
Acknowledgements 21
Abbreviations 25
Mathematical Notation 27
Part I Preliminaries 29
Introduction 30
Ontologies 35
Ontology Learning from Text 44
3.1 Ontology Learning Tasks 48
Basics 60
4.1 Natural Language Processing 60
4.2 Formal Concept Analysis 81
4.3 Machine Learning 87
Datasets 101
5.1 Corpora 101
5.2 Concept Hierarchies 103
5.3 Population Gold Standard 105
Part II Methods and Applications 106
Concept Hierarchy Induction 107
6.1 Common Approaches 108
6.2 Learning Concept Hierarchies with FCA 116
6.3 Guided Clustering 145
6.4 Learning from Heterogeneous Sources of Evidence 164
6.5 Related Work 178
6.6 Conclusion and Open Issues 204
Learning Attributes and Relations 207
7.1 Common Approaches 207
7.2 Learning Attributes 210
7.3 Learning Relations from Corpora 221
7.4 Learning Qualia Structures from the Web 229
7.5 Related Work 244
7.6 Conclusion and Open Issues 252
Population 254
8.1 Common Approaches 255
8.2 Corpus-based Population 259
8.3 Learning by Googling 270
8.4 Related Work 295
8.5 Conclusion and Open Issues 300
Applications 302
9.1 Text Clustering and Classification 304
9.2 Information Highlighting for Supporting Search 313
9.3 Related Work 320
9.4 Contribution and Open Issues 325
Part III Conclusion 327
Contribution and Outlook 328
Concluding Remarks 330
Appendix 332
A.1 Learning Accuracy 332
A.2 Mutually Similar Words for the tourism domain 336
A.3 Mutually Similar Words for the finance domain 337
A.4 The Penn Treebank Tag Set 339
References 340
Index 363
10 Contribution and Outlook (p. 309-310)
This book contributes to the state of the art in ontology learning in several ways. First, we have provided a formal definition of ontology learning tasks with respect to a well-defined ontology model. The ontology learning layer cake, a model for representing the diverse subtasks in ontology learning, has been introduced. In addition, evaluation measures for the concept hierarchy induction, relation learning and ontology population tasks have been defined. These evaluation measures provide a basis for comparing different approaches performing a certain task. Most importantly, several original and novel approaches performing a certain task have been presented and compared to other state-of-the-art approaches from the literature using the defined evaluation measures.
Concerning the concept hierarchy induction task, we have presented a novel approach based on Formal Concept Analysis, an original guided agglomerative clustering method as well as a combination approach for the induction of concept hierarchies from text. All the approaches have been evaluated and have been demonstrated to actually outperform current state-of-the-art methods. We have further introduced and discussed several approaches to learning attributes and relations. In particular, we have presented approaches to learn i) attributes, ii) the appropriate domain and range for relations, as well as iii) specific relations using a pattern-based approach. Several approaches to automatically populate an ontology with instances have also been described. We have in particular examined a similarity-based approach as well as introduced the original approach of Learning By Googling. Corresponding evaluations have also been provided. Finally, we have also discussed applications for ontology learning approaches and demonstrated for two concrete applications that the techniques developed in the context of this book are indeed useful. Throughout the book, we have also provided a thorough overview of related work.
Fortunately, there are a number of open issues which require further research. On the one hand, though we have undertaken a first step towards combining different ontology learning paradigms via a machine-learning approach, further research is needed in this direction to unveil the full potential of such a combination. In particular, paradigms other than our classification-based approach could be explored. One could imagine training classifiers for each type of basic ontological relation, i.e. isa, part-of, etc., using different methods and then using a calculus as envisioned by Heyer and colleagues [Heyer et al., 2001] as well as Ogata and Collier [Ogata and Collier, 2004] to combine the results of these classifiers and reason on different types of extracted ontological relations. Such post-extraction reasoning is in fact crucial, as the different approaches can produce contradicting information, and thus producing a consistent ontology needs some kind of contradiction resolution approach. In fact, one important problem is to generate the optimal ontology maximizing a certain criterion given a certain number of - possibly contradicting - relations. Initial blueprints for such an approach can be found, for example, in the work of Haase and Völker [Haase and Völker, 2005]. A lot of further research is however needed in this direction.
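To make the idea of contradiction resolution concrete, the following Python sketch selects a consistent subset from scored relation candidates. It is only an illustration under stated assumptions, not the calculus of the cited authors: the `Candidate` type, the confidence scores and the two consistency rules (one relation type per concept pair, an acyclic isa hierarchy) are all hypothetical, and the greedy strategy is a simple heuristic that does not guarantee the optimal ontology the text asks for.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    source: str
    relation: str  # e.g. "isa" or "part-of"
    target: str
    score: float   # hypothetical classifier confidence in [0, 1]


def reaches(edges, start, goal):
    """Return True if goal is reachable from start along the directed edges."""
    stack, seen = [start], set()
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(t for s, t in edges if s == node)
    return False


def select_consistent(candidates):
    """Greedily keep high-scoring candidates that do not contradict earlier picks."""
    accepted = []
    isa_edges = set()
    typed_pairs = {}  # unordered concept pair -> relation type already accepted
    for cand in sorted(candidates, key=lambda c: c.score, reverse=True):
        pair = frozenset((cand.source, cand.target))
        # Assumed rule 1: a concept pair carries at most one relation type.
        if typed_pairs.get(pair, cand.relation) != cand.relation:
            continue
        # Assumed rule 2: the isa hierarchy must stay acyclic.
        if cand.relation == "isa" and reaches(isa_edges, cand.target, cand.source):
            continue
        accepted.append(cand)
        typed_pairs[pair] = cand.relation
        if cand.relation == "isa":
            isa_edges.add((cand.source, cand.target))
    return accepted


candidates = [
    Candidate("hotel", "isa", "accommodation", 0.9),
    Candidate("accommodation", "isa", "hotel", 0.4),      # would create a cycle
    Candidate("hotel", "part-of", "accommodation", 0.6),  # conflicts with isa
    Candidate("room", "part-of", "hotel", 0.7),
]
for cand in select_consistent(candidates):
    print(cand.source, cand.relation, cand.target)
```

Processing candidates in descending score order means that a contradiction is always resolved in favour of the more confident classifier output. A global formulation that maximizes the summed score subject to the same rules would be closer to the optimization problem described above.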
Another important issue to be clarified is which similarity measures, which weighting measures and which features work best for the task of clustering words. Though we have provided some insights in the present book, much more work is needed to clarify these issues. In the same vein, further experiments are necessary to clarify the relation between syntactic similarity and semantic similarity as perceived by humans. These issues can only be approached from an experimental perspective. Though there has been a lot of work on this issue, much further research can be expected.
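As a small illustration of why the choice of similarity measure matters, the Python sketch below compares two words under cosine and Jaccard similarity. The feature names and counts are invented for the example and are not taken from the book's experiments. Cosine uses the raw feature counts, while Jaccard only looks at which features occur at all.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[f] * b[f] for f in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def jaccard(a, b):
    """Jaccard similarity over the sets of features that occur at all."""
    union = a.keys() | b.keys()
    return len(a.keys() & b.keys()) / len(union) if union else 0.0


# Hypothetical syntactic context features (verb_role) with made-up counts.
hotel = Counter({"book_obj": 3, "stay_in": 2})
apartment = Counter({"book_obj": 1, "rent_obj": 4})

print(round(cosine(hotel, apartment), 3))
print(round(jaccard(hotel, apartment), 3))
```

The two measures give different values for the same pair of words, so a clustering algorithm built on one or the other may group words differently. A weighting scheme applied to the counts before comparison would change the cosine value but leave the Jaccard value untouched.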
In general, from a theoretical perspective, it would be necessary to clarify what type of ontologies we can actually learn, i.e. domain ontologies, lexical ontologies, upper-level ontologies, application ontologies, etc. Work in this direction has been presented by Bateman [Bateman, 1991], for instance. In this line, it also seems necessary to ask ourselves about the limits of ontology learning techniques. Furthermore, an integration of ontology learning techniques with linguistic theories, in particular with lexicon theories such as the Generative Lexicon [Pustejovsky, 1991], is definitely desirable. In addition, it seems desirable to clarify the relation between ontological and lexical semantics. In the long term, it would definitely be interesting to acquire more complex relationships between concepts and relations in the form of rules or axioms. Last but not least, approaches should actually have reasonable applications. We have argued that it is far from straightforward to devise applications making use of automatically learned ontologies in a reasonable way. The problem lies in the fact that the success of using an ontology depends on a number of parameters that have to be tuned. However, the quest for applications is a necessary and crucial one. Future research should thus further examine the usefulness of automatically derived knowledge structures for certain applications.
| Publication date (per publisher) | 11 December 2006 |
|---|---|
| Additional info | XXVIII, 347 p. |
| Place of publication | New York |
| Language | English |
| Subject areas | Mathematics / Computer Science ► Computer Science ► Databases |
| | Mathematics / Computer Science ► Computer Science ► Graphics / Design |
| | Mathematics / Computer Science ► Computer Science ► Networks |
| | Computer Science ► Theory / Studies ► Artificial Intelligence / Robotics |
| | Mathematics / Computer Science ► Computer Science ► Web / Internet |
| | Technology |
| Keywords | algorithms • Applications • Cimiano • classification • Computer • Computer Science • Evaluation • knowledge management • learning • Ontologies • Ontology • Population • semantic web • Text |
| ISBN-13 | 9780387392523 |
DRM: Digital watermark
This eBook contains a digital watermark and is thus personalized for you. If the eBook is improperly passed on to third parties, it can be traced back to the source.
File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly well suited to technical books with columns, tables and figures. A PDF can be displayed on almost all devices, but is only of limited suitability for small displays (smartphone, eReader).
System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. It is, however, not compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer, e.g. the free Adobe Digital Editions app.
Buying eBooks from abroad
For tax law reasons we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.