
Knowledge-Based Bioinformatics (eBook)

From Analysis to Interpretation

Gil Alterovitz, Marco Ramoni (Editors)

eBook Download: PDF
2010
John Wiley & Sons (Publisher)
978-0-470-66970-9 (ISBN)

70,99 incl. VAT
(CHF 69,35)
The eBook is sold by Lehmanns Media GmbH (Berlin) at the price in euros incl. VAT.
  • Download available immediately
There is an increasing need throughout the biomedical sciences for a greater understanding of knowledge-based systems and their application to genomic and proteomic research. This book discusses knowledge-based and statistical approaches, along with applications in bioinformatics and systems biology. The text emphasizes the integration of different methods for analysing and interpreting biomedical data. This, in turn, can lead to breakthrough biomolecular discoveries, with applications in personalized medicine.

Key Features:

  • Explores the fundamentals and applications of knowledge-based and statistical approaches in bioinformatics and systems biology.
  • Helps readers to interpret genomic, proteomic, and metabolomic data in understanding complex biological molecules and their interactions.
  • Provides useful guidance on dealing with large datasets in knowledge bases, a common issue in bioinformatics.
  • Written by leading international experts in this field.

Students, researchers, and industry professionals with a background in biomedical sciences, mathematics, statistics, or computer science will benefit from this book. It will also be useful for readers worldwide who want to master the application of bioinformatics to real-world situations and understand biological problems that motivate algorithms.



Dr Gil Alterovitz, Harvard Medical School & Massachusetts Institute of Technology, USA
Dr Alterovitz regularly lectures on bioinformatics and biomedical computing. He is the editor of the successful Artech House book Systems Bioinformatics (2007).

Dr Marco Ramoni, Harvard Medical School & Massachusetts Institute of Technology, USA
Dr Ramoni is the Associate Director of Bioinformatics at Harvard Medical School. He has written numerous papers and book chapters on bioinformatics and is regularly invited to speak at conferences.



Knowledge-Based Bioinformatics
Contents
Preface
List of Contributors
PART I FUNDAMENTALS
Section 1 Knowledge-Driven Approaches
1 Knowledge-based bioinformatics
1.1 Introduction
1.2 Formal reasoning for bioinformatics
1.3 Knowledge representations
1.4 Collecting explicit knowledge
1.5 Representing common knowledge
1.6 Capturing novel knowledge
1.7 Knowledge discovery applications
1.8 Semantic harmonization: the power and limitation of ontologies
1.9 Text mining and extraction
1.10 Gene expression
1.11 Pathways and mechanistic knowledge
1.12 Genotypes and phenotypes
1.13 The Web’s role in knowledge mining
1.14 New frontiers
1.14.1 Requirements for linked knowledge discovery
1.14.2 Information aggregation
1.14.3 The Linked Open Data initiative
1.14.4 Information articulation
1.14.5 Next-generation knowledge discovery
1.15 References
2 Knowledge-driven approaches to genome-scale analysis
2.1 Fundamentals
2.1.1 The genomic era and systems biology
2.1.2 The exponential growth of biomedical knowledge
2.1.3 The challenges of finding and interacting with biomedical knowledge
2.2 Challenges in knowledge-driven approaches
2.2.1 We need to read: development of automatic methods to extract data housed in the biomedical literature
2.2.2 Implicit and implied knowledge: the forgotten data source
2.2.3 Humans are visual beings: so should their knowledge be
2.3 Current knowledge-based bioinformatics tools
2.3.1 Enrichment tools
2.3.2 Integration and expansion: from gene lists to networks
2.3.3 Expanding the concept of an interaction
2.3.4 A systematic failure to support advanced scientific reasoning
2.4 3R systems: reading, reasoning and reporting the way towards biomedical discovery
2.4.1 3R knowledge networks populated by reading and reasoning
2.4.2 Implied association results in uncertainty
2.4.3 Reporting: using 3R knowledge networks to tell biological stories
2.5 The Hanalyzer: a proof of 3R concept
2.6 Acknowledgements
2.7 References
3 Technologies and best practices for building bio-ontologies
3.1 Introduction
3.2 Knowledge representation languages and tools for building bio-ontologies
3.2.1 RDF (resource description framework)
3.2.2 OWL (Web ontology language)
3.2.3 OBO format
3.3 Best practices for building bio-ontologies
3.3.1 Define the scope of the bio-ontology
3.3.2 Identity of the represented entities
3.3.3 Commit to agreed ontological principles
3.3.4 Knowledge acquisition
3.3.5 Ontology Design Patterns (ODPs)
3.3.6 Ontology evaluation
3.3.7 Documentation
3.4 Conclusion
3.5 Acknowledgements
3.6 References
4 Design, implementation and updating of knowledge bases
4.1 Introduction
4.2 Sources of data in bioinformatics knowledge bases
4.2.1 Data added by internal curators
4.2.2 Data submitted by external users and collaborators
4.2.3 Data added automatically
4.3 Design of knowledge bases
4.3.1 Understanding your end users and understanding their data
4.3.2 Interactions and interfaces: their impact on design
4.4 Implementation of knowledge bases
4.4.1 Choosing a database architecture
4.4.2 Good programming practices
4.4.3 Implementation of interfaces
4.5 Updating of knowledge bases
4.5.1 Manual curation and auto-annotation
4.5.2 Clever pipelines and data flows
4.5.3 Lessening data maintenance overheads
4.6 Conclusions
4.7 References
Section 2 Data-Analysis Approaches
5 Classical statistical learning in bioinformatics
5.1 Introduction
5.2 Significance testing
5.2.1 Multiple testing and false discovery rate
5.2.2 Correlated errors
5.3 Exploratory analysis
5.3.1 Clustering
5.3.2 Principal components
5.3.3 Multidimensional scaling (MDS)
5.4 Classification and prediction
5.4.1 Discriminant analysis
5.4.2 Modern procedures
5.5 References
6 Bayesian methods in genomics and proteomics studies
6.1 Introduction
6.2 Bayes theorem and some simple applications
6.3 Inference of population structure from genetic marker data
6.4 Inference of protein binding motifs from sequence data
6.5 Inference of transcriptional regulatory networks from joint analysis of protein–DNA binding data and gene expression data
6.6 Inference of protein and domain interactions from yeast two-hybrid data
6.7 Conclusions
6.8 Acknowledgements
6.9 References
7 Automatic text analysis for bioinformatics knowledge discovery
7.1 Introduction
7.1.1 Knowledge discovery through text mining
7.1.2 Need for processing biomedical texts
7.1.3 Developing text mining solutions
7.2 Information needs for biomedical text mining
7.2.1 Efficient analysis of normalized information
7.2.2 Interactive seeking of textual information
7.3 Principles of text mining
7.3.1 Components
7.3.2 Methods
7.4 Development issues
7.4.1 Information needs
7.4.2 Corpus construction
7.4.3 Language analysis
7.4.4 Integration framework
7.4.5 Evaluation
7.5 Success stories
7.5.1 Interactive literature analysis
7.5.2 Integration into bioinformatics solutions
7.5.3 Discovery of knowledge from the literature
7.6 Conclusion
7.7 References
PART II APPLICATIONS
Section 3 Gene and Protein Information
8 Fundamentals of gene ontology functional annotation
8.1 Introduction
8.1.1 Data submission curation
8.1.2 Value-added curation
8.2 Gene Ontology (GO)
8.2.1 Gene Ontology and the annotation of the human proteome
8.2.2 Gene Ontology Consortium data sets
8.2.3 GO annotation methods
8.2.4 Different approaches to manual annotation
8.2.5 Ontology development
8.3 Comparative genomics and electronic protein annotation
8.3.1 Manual methods of transferring functional annotation
8.3.2 Electronic methods of transferring functional annotation
8.3.3 Electronic annotation methods
8.4 Community annotation
8.4.1 Feedback forms
8.4.2 Wiki pages
8.4.3 Community annotation workshops
8.5 Limitations
8.5.1 GO cannot capture all relevant biological aspects
8.5.2 The ontology is always evolving
8.5.3 The volume of literature
8.5.4 Missing published data
8.5.5 Manual curation is expensive
8.6 Accessing GO annotations
8.6.1 Tools for browsing the GO
8.6.2 Functional classification
8.6.3 GO slims
8.6.4 GO displays in other databases
8.7 Conclusions
8.8 References
9 Methods for improving genome annotation
9.1 The basis of gene annotation
9.1.1 Introduction to gene annotation
9.1.2 Progression in ab initio gene prediction
9.1.3 Annotation based on transcribed evidence
9.1.4 A comparison of annotation processes
9.1.5 The CCDS project
9.1.6 Pseudogene annotation
9.1.7 The annotation of non-coding genes
9.2 The impact of next generation sequencing on genome annotation
9.2.1 The annotation of multispecies genomes
9.2.2 Community annotation
9.2.3 Alternative splicing and new transcriptomics data
9.2.4 The annotation of human genome variation
9.2.5 The annotation of polymorphic gene families
9.3 References
10 Sequences from prokaryotic, eukaryotic, and viral genomes available clustered according to phylotype on a Self-Organizing Map
10.1 Introduction
10.2 Batch-learning SOM (BLSOM) adapted for genome informatics
10.3 Genome sequence analyses using BLSOM
10.3.1 BLSOMs for 13 eukaryotic genomes
10.3.2 Diagnostic oligonucleotides for phylotype-specific clustering
10.3.3 A large-scale BLSOM constructed with all sequences available from species-known genomes
10.3.4 Phylogenetic estimation for environmental DNA sequences and microbial community comparison using the BLSOM
10.3.5 Reassociation of environmental genomic fragments according to species
10.4 Conclusions and discussion
10.5 References
Section 4 Biomolecular Relationships and Meta-Relationships
11 Molecular network analysis and applications
11.1 Introduction
11.2 Topology analysis and applications
11.2.1 Global structure of molecular networks: scale-free, small-world, disassortative, and modular
11.2.2 Network statistics/measures
11.2.3 Applications of topology analysis
11.2.4 Challenges and future directions of topology analysis
11.3 Network motif analysis
11.3.1 Motif analysis: concept and method
11.3.2 Applications of motif analysis
11.3.3 Challenges and future directions of motif analysis
11.4 Network modular analysis and applications
11.4.1 Density-based clustering methods
11.4.2 Partition-based clustering methods
11.4.3 Centrality-based clustering methods
11.4.4 Hierarchical clustering methods
11.4.5 Applications of modular analysis
11.4.6 Challenges and future directions of modular analysis
11.5 Network comparison
11.5.1 Network comparison algorithms: from computer science to systems biology
11.5.2 Network comparison algorithms for molecular networks
11.5.3 Applications of molecular network comparison
11.5.4 Challenges and future directions of network comparison
11.6 Network analysis software and tools
11.7 Summary
11.8 Acknowledgement
11.9 References
12 Biological pathway analysis: an overview of Reactome and other integrative pathway knowledge bases
12.1 Biological pathway analysis and pathway knowledge bases
12.2 Overview of high-throughput data capture technologies and data repositories
12.3 Brief review of selected pathway knowledge bases
12.3.1 Reactome
12.3.2 KEGG
12.3.3 WikiPathways
12.3.4 NCI-Pathway Interaction Database
12.3.5 NCBI-BioSystems
12.3.6 Science Signaling
12.3.7 PharmGKB
12.4 How does information get into pathway knowledge bases?
12.5 Introduction to data exchange languages
12.5.1 SBML
12.5.2 BioPAX
12.5.3 PSI MI
12.5.4 Comparison of data exchange formats for different pathway knowledge bases
12.6 Visualization tools
12.7 Use case: pathway analysis in Reactome using statistical analysis of high-throughput data sets
12.8 Discussion: challenges and future directions of pathway knowledge bases
12.9 References
13 Methods and challenges of identifying biomolecular relationships and networks associated with complex diseases/phenotypes, and their application to drug treatments
13.1 Complex traits: clinical phenomenology and molecular background
13.2 Why it is challenging to infer relationships between genes and phenotypes in complex traits?
13.3 Bottom-up or top-down: which approach is more useful in delineating complex traits key drivers?
13.4 High-throughput technologies and their applications in complex traits genetics
13.5 Integrative systems biology: a comprehensive approach to mining high-throughput data
13.6 Methods applying systems biology approach in the identification of functional relationships from gene expression data
13.6.1 Methods using quantitative expression data to identify correlations in expression between genes (clustering)
13.6.2 Methods integrating functional genomics into cellular functional classes
13.6.3 Methods combining functional genomics results and existing biological information to construct novel biological networks
13.7 Advantages of networks exploration in molecular biology and drug discovery
13.8 Practical examples of applying systems biology approaches and network exploration in the identification of functional modules and disease-causing genes in complex phenotypes/diseases
13.9 Challenges and future directions
13.10 References
Trends and conclusion
Index

Publication date (per publisher) 1 July 2010
Language English
Subject areas Mathematics / Computer Science > Mathematics > Statistics
Mathematics / Computer Science > Mathematics > Probability / Combinatorics
Medicine / Pharmacy > General / Reference Works
Natural Sciences > Biology
Technology > Environmental Engineering / Biotechnology
Keywords analysing • application • Applications • Bioinformatics • Bioinformatics & Computational Biology • biomedical • biomedical engineering • biomolecular discoveries • Biostatistics • Book • breakthrough • Data • Different • emphasizes • greater • increasing • Integration • knowledgebased • Life Sciences • Medical Informatics & Biomedical Information Technology • Medicine • Methods • Sciences • Statistics • Understanding
ISBN-10 0-470-66970-5 / 0470669705
ISBN-13 978-0-470-66970-9 / 9780470669709
PDF (Adobe DRM)

Copy protection: Adobe DRM
Adobe DRM is a copy-protection scheme intended to protect the eBook against misuse. The eBook is authorized to your personal Adobe ID at download time. You can then read the eBook only on devices that are also registered to your Adobe ID.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly well suited to technical books with columns, tables, and figures. A PDF can be displayed on almost any device, but it is only of limited suitability for small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need an Adobe ID and the free Adobe Digital Editions software. We advise against using the OverDrive Media Console; in our experience, it frequently causes problems with Adobe DRM.
eReader: This eBook can be read on (almost) all eBook readers. It is not, however, compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need an Adobe ID and a free app.

Buying eBooks from abroad
For tax law reasons, we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.
