From Protein Structure to Function with Bioinformatics (eBook)

eBook Download: PDF

2017 | 2nd edition
XV, 509 pages
Springer Netherlands (publisher)
978-94-024-1069-3 (ISBN)

The book is split into two broad sections, the first covering methods to generate or infer protein structure, the second dealing with structure-based function annotation. Each chapter is written by world experts in the field. The first section covers methods ranging from traditional homology modelling and fold recognition to fragment-based ab initio methods, and includes a chapter, new for the second edition, on structure prediction using evolutionary covariance. Membrane proteins and intrinsically disordered proteins are each assigned chapters, while two new chapters deal with amyloid structures and means to predict modes of protein-protein interaction. The second section includes chapters covering functional diversity within protein folds and means to assign function based on surface properties and recurring motifs. Further chapters cover the key roles of protein dynamics in protein function and use of automated servers for function inference. The book concludes with two chapters covering case studies of structure prediction, based respectively on crystal structures and protein models, providing numerous examples of real-world usage of the methods mentioned previously.

This book is targeted at postgraduate students and academic researchers. It is most obviously of interest to protein bioinformaticians and structural biologists, but should also serve as a guide to biologists more broadly by highlighting the insights that structural bioinformatics can provide into proteins of their interest.

Daniel Rigden is a Reader in post-genomic bioinformatics in the Institute of Integrative Biology. His interests span the broad relationships between protein sequences, structures and functions and how these evolve with time. As such, he applies a wide range of bioinformatics tools to diverse proteins of interest. This leads to interesting collaborations across the Institute and more broadly. A current prime interest is the solution of crystal structures by Molecular Replacement using unconventional protein models, implemented in the program AMPLE.

This book is about protein structural bioinformatics and how it can help understand and predict protein function. It covers structure-based methods that can assign and explain protein function based on overall folds, characteristics of protein surfaces, occurrence of small 3D motifs, protein-protein interactions and on dynamic properties. Such methods help extract maximum value from new experimental structures, but can often be applied to protein models. The book also, therefore, provides comprehensive coverage of methods for predicting or inferring protein structure, covering all structural classes from globular proteins and their membrane-resident counterparts to amyloid structures and intrinsically disordered proteins.

Preface to the Second Edition 5
References 6
Contents 7
Generating and Inferring Structures 16
1 Ab Initio Protein Structure Prediction 17
Abstract 17
1.1 Introduction 18
1.2 Energy Functions 19
1.2.1 Physics-Based Energy Functions 21
1.2.2 Knowledge-Based Energy Function Combined with Fragments 25
1.3 Conformational Search Methods 32
1.3.1 Monte Carlo Simulations 32
1.3.2 Molecular Dynamics 33
1.3.3 Genetic Algorithm 34
1.3.4 Mathematical Optimization 35
1.4 Model Selection 35
1.4.1 Physics-Based Energy Function 36
1.4.2 Knowledge-Based Energy Function 37
1.4.3 Sequence-Structure Compatibility Function 38
1.4.4 Clustering of Decoy Structures 39
1.5 Remarks and Discussions 39
Acknowledgements 41
References 41
2 Protein Structures, Interactions and Function from Evolutionary Couplings 50
Abstract 50
2.1 Introduction 51
2.2 Evolutionary Couplings from Sequence Alignments 55
2.2.1 The Global Model 55
2.3 Three-Dimensional Protein Structures from Evolutionary Couplings 59
2.3.1 Transmembrane Proteins 61
2.3.2 Protein Interactions and Complexes 62
2.3.3 Conformational Plasticity and Disordered Proteins 64
2.4 Predicting the Effect of Mutations 65
2.5 Summary and Future Challenges 67
References 68
3 Fold Recognition 72
Abstract 72
3.1 Introduction 72
3.1.1 The Importance of Blind Trials: The CASP Competition 73
3.1.2 Ab Initio Structure Prediction Versus Homology Modelling 73
3.1.3 The Limits of Fold Space 75
3.2 Pushing Sequence Similarity to the Limits: The Power of Evolutionary Information 77
3.2.1 The Rise of Hidden Markov Models 80
3.2.2 Using Predicted Structural Features 81
3.2.3 Harnessing 3D Structure to Enhance Recognition 83
3.2.4 Knowledge-Based Potentials 83
3.2.5 Summary 85
3.3 CASP: The Great Filter 85
3.3.1 The Leaders 86
3.3.2 Individual Algorithms 86
3.3.3 Consensus Methods 88
3.4 Post-processing 89
3.4.1 Choosing and Combining Candidate Models 89
3.4.1.1 Clustering 90
3.4.1.2 Model Quality Assessment Programs (MQAPs) 90
3.4.1.3 Combining Models Optimally—Multiple Template Modelling 92
3.4.2 Post-processing in Practice 92
3.4.3 Use of Contacts 95
3.4.3.1 From Sequence to Profiles to Contact Maps 97
3.5 Tools for Fold Recognition on the Web 98
3.6 The Future 99
References 101
4 Comparative Protein Structure Modelling 104
Abstract 104
4.1 Introduction 104
4.1.1 Structure Determines Function 104
4.1.2 Sequences, Structures, Structural Genomics 105
4.1.3 Approaches to Protein Structure Prediction 107
4.2 Steps in Comparative Protein Structure Modelling 109
4.2.1 Searching for Structures Related to the Target Sequence 111
4.2.2 Selecting Templates 113
4.2.3 Sequence to Structure Alignment 115
4.2.4 Model Building 116
4.2.4.1 Template Dependent Modelling 116
4.2.4.2 Template Independent Modelling: Modelling Loops, Insertions 119
4.2.4.3 Refining Models 123
4.2.4.4 Hybrid Modelling of Proteins and Complexes with Experimental Restraints 124
4.2.5 Model Evaluation 127
4.3 Performance of Comparative Modelling 129
4.3.1 Accuracy of Methods 129
4.3.2 Errors in Comparative Models 130
4.4 Applications of Comparative Modelling 132
4.4.1 Modelling of Individual Proteins 132
4.4.2 Comparative Modelling and the Protein Structure Initiative 132
4.5 Summary 133
References 134
5 Advances in Computational Methods for Transmembrane Protein Structure Prediction 148
Abstract 148
5.1 Introduction 149
5.2 Membrane Protein Structural Classes 149
5.2.1 α-Helical Bundles 150
5.2.2 Transmembrane β-Barrels 150
5.3 Databases 152
5.4 Multiple Sequence Alignments 153
5.5 Transmembrane Protein Topology Prediction 154
5.5.1 Early α-Helical Topology Prediction Approaches 155
5.5.2 Machine Learning Approaches for α-Helical Topology Prediction 155
5.5.3 Signal Peptides and Re-entrant Helices 157
5.5.4 Consensus Approaches for α-Helical Topology Prediction 158
5.5.5 Transmembrane β-Barrel Topology Prediction 159
5.5.6 Empirical Approaches for β-Barrel Topology Prediction 160
5.5.7 Machine Learning Approaches for β-Barrel Topology Prediction 161
5.5.8 Consensus Approaches for β-Barrel Topology Prediction 162
5.6 3D Structure Prediction 163
5.6.1 Homology Modelling of α-Helical Transmembrane Proteins 163
5.6.2 Homology Modelling of Transmembrane β-Barrel Proteins 164
5.6.3 De Novo Modelling of α-Helical Transmembrane Proteins 165
5.6.4 De Novo Modelling of Transmembrane β-Barrels 167
5.6.5 Covariation-Based Approaches 167
5.6.6 Evolutionary Covariation-Based Methods for De Novo Modelling of α-Helical Membrane Proteins 168
5.6.7 Evolutionary Covariation-Based Methods for Transmembrane β-Barrel Structure Prediction 170
5.7 Future Directions 171
Competing Interests 171
References 171
6 Bioinformatics Approaches to the Structure and Function of Intrinsically Disordered Proteins 179
Abstract 179
6.1 The Concept of Protein Disorder 180
6.2 Sequence Features of IDPs 181
6.2.1 The Unusual Amino Acid Composition of IDPs 181
6.2.2 Low Sequence Complexity and Disorder 181
6.2.3 Flavours of Disorder 182
6.3 Prediction of Disorder 183
6.3.1 Charge-Hydropathy Plot 183
6.3.2 Propensity-Based Predictors 183
6.3.3 Prediction Based on Simplified Biophysical Models 186
6.3.4 Machine Learning Algorithms 187
6.3.5 Related Approaches for the Prediction of Protein Disorder 189
6.3.6 Comparison of Disorder Prediction Methods 190
6.4 Databases of IDPs 191
6.5 Structural Features of IDPs 192
6.6 Functional Classification of IDPs 193
6.6.1 Gene Ontology-Based Functional Classification of IDPs 194
6.6.2 Classification of IDPs Based on Their Mechanism of Action 195
6.6.2.1 Entropic Chains 196
6.6.2.2 Function by Transient Binding 196
6.6.2.3 Functions by Permanent Binding 197
6.6.3 Functional Features of IDPs 197
6.6.3.1 Short Linear Motifs 198
6.6.3.2 Disordered Binding Regions/Molecular Recognition Features 199
6.6.3.3 Intrinsically Disordered Domains 199
6.7 Prediction of the Function of IDPs 200
6.7.1 Predicting Short Recognition Motifs in IDRs 202
6.7.2 Prediction of Disordered Binding Regions/MoRFs 203
6.7.3 Combination of Information on Sequence and Disorder: Phosphorylation Sites and CaM Binding Motifs 204
6.7.4 Correlation of Disorder Pattern and Function 205
6.8 Evolution of IDPs 206
6.9 Conclusions 207
Acknowledgements 207
References 207
7 Prediction of Protein Aggregation and Amyloid Formation 216
Abstract 216
7.1 Introduction 217
7.2 The Physico-chemical and Structural Basis of Protein Aggregation 217
7.2.1 Intrinsic Determinants of Protein Aggregation 224
7.2.2 Extrinsic Determinants of Protein Aggregation 225
7.2.3 Specific Sequence Stretches Drive Aggregation 225
7.2.4 Structural Determinants of Amyloid-like Aggregation 226
7.3 Prediction of Protein Aggregation from the Primary Sequence 227
7.3.1 Phenomenological Approaches 232
7.3.2 Structure-Based Approaches 236
7.3.3 Consensus Methods 241
7.3.4 Applications of Sequence-Based Predictors 243
7.3.4.1 Proteome-Wide Analyses 243
7.3.4.2 Prediction of in vivo Protein Aggregation 252
7.4 Prediction of Aggregation Propensity from the Tertiary Structure 253
7.5 Concluding Remarks 264
References 265
8 Prediction of Biomolecular Complexes 275
Abstract 275
8.1 Introduction 276
8.2 Docking 278
8.2.1 Step 1: Searching 279
8.2.2 Step 2: Scoring 280
8.2.3 Data-Driven Docking 284
8.3 The Challenges of Docking: Flexibility and Binding Affinity 285
8.3.1 Changes upon Binding: The Flexible Docking Challenge 285
8.3.2 The ‘Perfect’ Scoring Function and the Binding Affinity Problem 286
8.4 Protein-Peptide Docking 288
8.5 Post-docking: Interface Prediction from Docking Results and Use of Docking-Derived Contacts for Clustering and Ranking 289
8.5.1 Web Tools for the Post-docking Processing 291
8.6 Concluding Remarks 293
Acknowledgements 293
References 294
From Structures to Functions 303
9 Function Diversity Within Folds and Superfamilies 304
Abstract 304
9.1 Defining Function 305
9.2 From Fold to Function 306
9.2.1 Definition of a Fold 306
9.2.1.1 General Understanding 306
9.2.1.2 Practical Definitions 307
9.2.1.3 Paradigm Shift 308
9.2.2 Prediction of Function Using Fold Relationships 309
9.2.2.1 Folds with a Single Function 309
9.2.2.2 Supersites 310
9.2.2.3 Superfolds 312
9.3 Function Diversity Between Homologous Proteins 312
9.3.1 Definitions 312
9.3.1.1 General Understanding 312
9.3.1.2 Practical Definitions 313
9.3.2 Evolution of Protein Superfamilies 316
9.3.3 Function Divergence During Protein Evolution 317
9.3.3.1 Function Diversity at the Superfamily Level 318
9.3.3.2 Function Diversity Between Close Homologues 324
9.4 Conclusion 329
Bibliography 329
10 Function Prediction Using Patches, Pockets and Other Surface Properties 335
Abstract 335
10.1 Definitions of Protein Surfaces 336
10.2 Surface Patches 337
10.2.1 Hydrophobic Patches 337
10.2.2 Electrostatics 344
10.2.3 Sequence Conservation 346
10.2.4 Surface Atom Triplet Propensities 347
10.2.5 Multiple Properties 348
10.3 Pockets 348
10.3.1 Geometric Descriptions of Pockets 350
10.3.2 Channels and Tunnels 351
10.3.3 Distinguishing Functional Pockets 352
10.3.4 Predicting Ligands for Pockets 353
10.3.4.1 Pocket Matching 353
10.3.4.2 Docking for Function Prediction 354
10.4 Prediction of Catalytic Residues 355
10.5 Protein-Protein Interfaces 357
10.6 Other Specialised Binding Site Predictors 358
10.7 Medicinal Applications 360
10.8 Conclusions 361
References 362
11 3D Motifs 369
Abstract 369
11.1 Background: Functional Annotation 370
11.1.1 What Is Function? 371
11.1.2 Genomics and Functional Annotation 371
11.1.3 The Need for Structure-Based Methods 373
11.2 3D Motif Matching Techniques 374
11.2.1 What Is a 3D Motif? 374
11.2.2 Historical Development of Motif Matching Methods 377
11.3 Algorithmic Approaches to Motif Matching 381
11.3.1 Methods Using 3D Motifs 382
11.3.2 Efficiency Considerations for 3D Motifs 383
11.3.3 Methods with Nonstandard Motif Information 384
11.3.4 Interpretation of Results 385
11.4 Methods for Deriving Motifs 386
11.4.1 Literature Search and Manual Curation 387
11.4.2 Annotated Sites in PDB Structures 387
11.4.3 Mining for Emergent Properties 388
11.4.3.1 Undirected Mining 388
11.4.3.2 Directed Mining 389
11.4.3.3 Directed Mining with Positive and Negative Examples 390
11.5 Molecular Docking for Functional Annotation 391
11.6 Discussion and Conclusions 393
Acknowledgements 393
References 394
12 Protein Dynamics: From Structure to Function 401
Abstract 401
12.1 Molecular Dynamics Simulations 401
12.1.1 Principles and Approximations 402
12.1.2 Applications 404
12.1.2.1 Nuclear Transport Receptors 405
12.1.2.2 Lysozyme 406
12.1.2.3 Aquaporins 408
12.1.3 Limitations—Enhanced Sampling Algorithms 410
12.1.3.1 Replica Exchange 411
12.2 Principal Component Analysis 414
12.3 Collective Coordinate Sampling Algorithms 417
12.3.1 Essential Dynamics 417
12.3.2 TEE-REX 418
12.3.2.1 Applications: Finding Transition Pathways in Adenylate Kinase 419
12.4 Methods for Functional Mode Prediction 421
12.4.1 Normal Mode Analysis 421
12.4.2 Elastic Network Models 422
12.4.3 CONCOORD 423
12.4.3.1 Applications 423
12.5 Summary and Outlook 427
References 428
13 Integrated Servers for Structure-Informed Function Prediction 434
Abstract 434
13.1 Introduction 434
13.1.1 The Problem of Predicting Function from Structure 435
13.1.2 Structure-Function Prediction Methods 437
13.2 ProKnow 438
13.2.1 Fold Matching 439
13.2.2 3D Motifs 441
13.2.3 Sequence Homology 441
13.2.4 Sequence Motifs 441
13.2.5 Protein Interactions 441
13.2.6 Combining the Predictions 442
13.2.7 Prediction Success 442
13.3 ProFunc 443
13.3.1 ProFunc’s Structure-Based Methods 444
13.3.1.1 Fold-Matching 444
13.3.1.2 Surface Clefts 445
13.3.1.3 Nests 445
13.3.1.4 Template Methods 446
13.3.1.5 PDBsum Structural Analyses 449
13.3.2 Assessment of the Structural Methods 449
13.4 Conclusion 451
Acknowledgements 451
References 452
14 Case Studies: Function Predictions of Structural Genomics Results 456
Abstract 456
14.1 Introduction 456
14.2 Function Prediction Case Studies 458
14.2.1 Teichman et al. (2001) 458
14.2.2 Kim et al. (2003) 458
14.2.3 Watson et al. (2007) 460
14.2.4 Lee et al. (2011) 463
14.3 Some Specific Examples 463
14.3.1 Adams et al. (2007) 463
14.3.2 AF0491 Protein 464
14.3.3 The GxGYxYP Family 466
14.4 Community Annotation 467
14.5 Conclusions 468
Acknowledgements 469
References 469
15 Prediction of Protein Function from Theoretical Models 473
Abstract 473
15.1 Background 473
15.2 Suitability of Protein 3D Models for Structure-Based Predictions 475
15.2.1 Surface Properties 476
15.2.2 Functional Sites 478
15.2.3 Specific Binding Predictions 479
15.2.4 Small Molecule Binding 480
15.2.5 Protein-Protein Interactions 482
15.2.6 Protein Model Databases 483
15.3 Function Prediction Examples 484
15.3.1 Fold Prediction with Fragment-Based Ab Initio Models 484
15.3.2 Fold Prediction with Contact-Based Models 487
15.3.3 Plasticity of Catalytic Site Residues 489
15.3.4 Prediction of Ligand Specificity 490
15.3.5 Prediction of Cofactor Specificity Using an Entry from a Database of Models 491
15.3.6 Mutation Mapping 494
15.3.7 Protein Complexes 495
15.3.8 Structure Modelling of Alternatively Spliced Isoforms 496
15.3.9 From Broad Function to Molecular Details 497
15.4 Conclusions 499
References 499
Index 505

Publication date (per publisher)	6 April 2017
Additional info	XV, 503 p. 86 illus., 75 illus. in color.
Place of publication	Dordrecht
Language	English
Subject area	Medicine / Pharmacy
	Natural Sciences ► Biology ► Biochemistry
	Natural Sciences ► Biology ► Genetics / Molecular Biology
	Natural Sciences ► Biology ► Microbiology / Immunology
	Technology
Keywords	algorithms • Bioinformatics • classification • Databases • Gene Ontology • Secondary structure
ISBN-10	94-024-1069-4 / 9402410694
ISBN-13	978-94-024-1069-3 / 9789402410693

PDF (watermarked)
Size: 13.2 MB

DRM: Digital watermark
This eBook contains a digital watermark and is thereby personalized to you. If the eBook is improperly passed on to third parties, it can be traced back to the source.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly suitable for specialist books with columns, tables, and figures. A PDF can be displayed on almost all devices but is only of limited suitability for small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need a PDF viewer, e.g. Adobe Reader or Adobe Digital Editions.
eReader: This eBook can be read on (almost) all eBook readers. It is not compatible with the Amazon Kindle, however.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need a PDF viewer, e.g. the free Adobe Digital Editions app.

Additional feature: Read online
In addition to downloading it, you can also read this eBook online in your web browser.

Buying eBooks from abroad
For tax reasons, we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfill eBook orders from other countries.

Print edition

Book | Hardcover

CHF 449,35