Chemometrics (eBook)
John Wiley & Sons (publisher)
978-1-118-90468-8 (ISBN)
A new, full-color, completely updated edition of the key practical guide to chemometrics
This new edition of the practical guide to chemometrics emphasizes the principles and applications behind the main ideas in the field using numerical and graphical examples, which can then be applied to a wide variety of problems in chemistry, biology, chemical engineering, and allied disciplines. Presented in full color, it features expanded sections on principal component analysis, classification, multivariate evolutionary signals and statistical distributions, new case studies in metabolomics, and extensive updates throughout. Aimed at the large number of users of chemometrics, it includes extensive worked problems and chapters explaining how to analyze datasets, in addition to updated descriptions of how to apply Excel and Matlab for chemometrics.
Chemometrics: Data Driven Extraction for Science, Second Edition offers chapters covering: experimental design, signal processing, pattern recognition, calibration, and evolutionary data. The pattern recognition chapter from the first edition is divided into two separate ones: Principal Component Analysis/Cluster Analysis, and Classification. It also includes new descriptions of Alternating Least Squares (ALS) and Iterative Target Transformation Factor Analysis (ITTFA). Updated descriptions of wavelets and Bayesian methods are included.
- Includes updated chapters of the classic chemometric methods (e.g. experimental design, signal processing, etc.)
- Introduces metabolomics-type examples alongside those from analytical chemistry
- Features problems at the end of each chapter to illustrate the broad applicability of the methods in different fields
- Supplemented with data sets and solutions to the problems on a dedicated website, www.booksupport.wiley.com
Chemometrics: Data Driven Extraction for Science, Second Edition is recommended for post-graduate students of chemometrics as well as applied scientists (e.g. chemists, biochemists, engineers, statisticians) working in all areas of data analysis.
RICHARD G. BRERETON is Director of Brereton Consultancy and Emeritus Professor at the University of Bristol, UK. He is a Fellow of the Royal Society of Chemistry, Royal Statistical Society and Royal Society of Medicine. He has applied chemometrics in a wide variety of areas including pharmaceuticals, materials, metabolomics, heritage studies and forensics, and has published over 400 articles, including writing/editing eight books.
Contents
1 Introduction
1.1 Historical Parentage
1.2 Developments since the 1970s
1.3 Software and Calculations
1.4 Further Reading
References
2 Experimental Design
2.1 Introduction
2.2 Basic Principles
2.3 Factorial Designs
2.4 Central Composite or Response Surface Designs
2.5 Mixture Designs
2.6 Simplex Optimisation
Problems
3 Signal Processing
3.1 Introduction
3.2 Basics
3.3 Linear Filters
3.4 Correlograms and Time Series Analysis
3.5 Fourier Transform Techniques
3.6 Additional Methods
Problems
4 Principal Component Analysis and Unsupervised Pattern Recognition
4.1 Introduction
4.2 The Concept and Need for Principal Components Analysis
4.3 Principal Components Analysis: The Method
4.4 Factor Analysis
4.5 Graphical Representation of Scores and Loadings
4.6 Pre-processing
4.7 Comparing Multivariate Patterns
4.8 Unsupervised Pattern Recognition: Cluster Analysis
4.9 Multi-way Pattern Recognition
Problems
5 Classification and Supervised Pattern Recognition
5.1 Introduction
5.2 Two-Class Classifiers
5.3 One-Class Classifiers
5.4 Multi-Class Classifiers
5.5 Optimisation and Validation
5.6 Significant Variables
Problems
6 Calibration
6.1 Introduction
6.2 Univariate Calibration
6.3 Multiple Linear Regression
6.4 Principal Components Regression
6.5 Partial Least Squares Regression
6.6 Model Validation and Optimisation
Problems
7 Evolutionary Multivariate Signals
7.1 Introduction
7.2 Exploratory Data Analysis and Pre-processing
7.3 Determining Composition
7.4 Resolution
Problems
A Appendix
A.1 Vectors and Matrices
A.2 Algorithms
A.3 Basic Statistical Concepts
A.4 Excel for Chemometrics
A.5 Matlab for Chemometrics
"... fills a gap in the chemometrics literature landscape. With its unique approach of learning-by-doing it is best suited for practitioners, which do not want to dig too deep into the theory and are not interested in a full coverage of methods. Nevertheless, the most important and usual applied chemometrics methods are introduced... The example data sets of the book are also worth exploring by itself, because they are well chosen and nicely structured."
--Thomas Bocklitz, Analytical and Bioanalytical Chemistry (2019)
Chapter 1
Introduction
1.1 Historical Parentage
There are many opinions about the origin of chemometrics. Until quite recently, the birth of chemometrics was considered to have happened in the 1970s. Its name first appeared in 1972 in an article by Svante Wold [1]: in fact, the topic of this article was not one that we would recognise as being core to chemometrics, being relevant to neither multivariate analysis nor experimental design. For over a decade, the word chemometrics had a very low profile, and it developed a recognisable presence only in the 1980s, as described below.
However, if an explorer describes a new species in a forest, the species was there long before the explorer. Thus, the naming of the discipline just recognises that it had reached some level of visibility and maturity. As people re-evaluate the origins of chemometrics, the birth can be traced many years back.
Chemometrics burst into the world due to three fundamental factors: applied statistics (multivariate and experimental design), statistics in analytical and physical chemistry, and scientific computing.
1.1.1 Applied Statistics
The ideas of multivariate statistics have been around a long time. R.A. Fisher and colleagues working in Rothamsted, UK, formalised many of our modern ideas while applying them primarily to agriculture. In the UK, before the First World War, many of the upper classes owned extensive land and relied on their income from tenant farmers and agricultural labourers. After the First World War, the cost of labour became higher, with many moving to the cities, and there was stronger competition from global food imports. This meant that historic agricultural practices were seen to be inefficient and it was hard for landowners (or companies that took over large estates) to be economic and competitive, hence a huge emphasis on agricultural research, including statistics, to improve these practices. R.A. Fisher and co-workers published some of the first major books and papers that we would regard as defining modern statistical thinking [2, 3], introducing ideas ranging from the null hypothesis to discriminant analysis to ANOVA. Some of the work of Fisher followed from the pioneering work of Karl Pearson at University College London, who had earlier founded the world's first statistics department and had first formulated ideas such as p values and correlation coefficients.
During the 1920s and 1930s, a number of important pioneers of multivariate statistics published their work, many strongly influenced by, or having worked with, Fisher. They included Harold Hotelling, credited by many as defining principal components analysis (PCA) [4], although Pearson had independently described this method some 30 years earlier under a different guise. Ideas are often reported several times over in science, and it is the person who names and popularises an idea that often gets the credit: in the early twentieth century, libraries were often localised, there were very few international journals (Hotelling worked mainly in the US) and there was certainly no internet; therefore, parallel work was often reported.
The principles of statistical experimental design were also formulated at around this period. There had been early reports on what we regard as modern approaches to formal designs before that, for example James Lind's work on scurvy in the eighteenth century and Charles Peirce's discussion on randomised trials in the nineteenth century, but Fisher's classic work of the 1930s put all the concepts together in a rigorous statistical format [5].
Much non-Bayesian, applied statistical thinking has been based on principles established in the 1920s and 1930s, for nearly a century. Early applications include agriculture, psychology, finance and genetics. After the Second World War, the chemical industry took an interest. In the 1920s, an important need was to improve agricultural practice, but by the 1950s, a major need was to improve processes in manufacturing, especially chemical engineering; hence, many more statisticians were employed within the industry. O.L. Davies edited an important book on experimental design with contributions from colleagues in ICI [6]. Foremost was G.E.P. Box, son-in-law of Fisher, whose book with colleagues is one of the most important post-war classics in experimental design and multi-linear regression [7].
These statistical building blocks were already mature by the time people started calling themselves chemometricians and have changed only a little during the intervening period.
1.1.2 Statistics in Analytical and Physical Chemistry
Statistical methods, for example, to estimate accuracy and precision of measurements or to determine a best-fit linear relationship between two variables, have been available to analytical and physical chemists for over a century. Almost every general analytical textbook includes chapters on univariate statistics and has done for decades. Although theoretically we could view this as applied statistics, on the whole, the people who advanced statistics in analytical chemistry did not class themselves as applied statisticians and specialist terminology has developed over time.
Most quantitative analytical and physical chemistry until the 1970s was viewed as a univariate field; that is, only one independent variable was measured in an experiment. Usually, all other external factors were kept constant. This so-called ‘One Factor at a Time’ (OFAT) approach worked well in mechanics or fundamental physics. Hence, statistical methods were primarily used for univariate analysis of data. By the late 1940s, some analytical chemists were aware of ANOVA, F-tests and linear regression [8], although the term chemometrics had not yet been invented; multivariate data came along much later.
There would have been very limited cross-fertilisation between applied statisticians, working in mathematics departments, and analytical chemists in chemistry departments, during these early days. Different departments often had different buildings, different libraries and different textbooks. A chemist, however numerate, would feel a stranger walking into a maths building and would probably cocoon themselves in their own library. There was no such thing as the Internet, Web of Knowledge or electronic journals. Maths journals published papers for mathematicians, and chemistry journals for chemists. Although in areas such as agriculture and psychology there was a tradition of consulting statisticians, chemists were numerate and tended to talk to each other – an experimental chemist wanting to fit a straight line would talk to a physical chemist in the tea room if need be. Hence, ideas did not travel in academia. Industry was somewhat more pragmatic, but even there, the main statistical innovations were in chemical engineering and process chemistry and often classed as industrial chemistry. The top universities often did not teach or research industrial chemistry, although they did teach Newtonian physics and relativity. In fact, the treatment of variables and errors by physicists trying, for example, to measure gravitational effects or the distance of a star is quite different to multivariate statistics: the former try to design experiments so that only one factor is studied and to make sure any errors are minimised and from one source, whereas a multivariate statistician might accept and expect data to be multifactorial.
Hence, statistics in analytical chemistry diverged from applied statistics for many decades. Caulcutt and Boddy's book, first published in 1983, contains nothing on multivariate statistics [9], and in Miller and Miller's book of 1993 just one out of six main chapters is devoted to experimental design, optimisation and pattern recognition (including PCA) [10].
Even now, there are numerous useful books aimed at analytical and physical chemists that omit multivariate statistics. An elaborate vocabulary has developed for the needs of analytical chemists, with specialist concepts that are rarely encountered in other areas. Some analytical chemists in the 1960s to 1980s were aware that multivariate approaches existed and did venture into chemometrics, but good multivariate data were limited. Most are aware of ANOVA and experimental design. However, statistics for analytical chemistry tends to lead a separate existence from chemometrics, although multivariate methods derived from chemometrics do have a small foothold within most graduate-level courses and books in general analytical chemistry, and certainly quantitative analytical (and physical) chemistry was an important building block for modern chemometrics.
Over the last two decades, however, applications of chemometrics have moved far beyond traditional quantitative analytical chemistry, for example, into the areas of metabolomics, environment, cultural heritage or food, where the outcome is not necessarily to measure accurately the concentration of an analyte or how many compounds are in the spectra of a series of mixtures. This means that the aim of some chemometric analysis has changed. We often do not have, for example, well-established reference samples and, in many cases, we cannot judge a method by how efficiently it predicts properties of these reference samples. We may not know whether the spectra of some extracts of urine samples contain enough information to tell whether our donors are diseased or not: it may depend on how the disease has progressed, how good the diagnosis is, what the genetics of the donor are and so on. Hence, we may never have a model that perfectly distinguishes two groups of samples. In classical physical or...
| Publication date (per publisher) | 13 March 2018 |
|---|---|
| Language | English |
| Subject areas | Mathematics / Computer Science ► Mathematics |
| | Natural Sciences ► Biology |
| | Natural Sciences ► Chemistry |
| Keywords | alternating least squares (ALS) • Analytical Chemistry • analyze chemometrics datasets using Excel • analyze chemometrics datasets using Matlab • Bioinformatics & Computational Biology • Chemistry • Chemometrics • Chemometrics & Data Handling • chemometrics and biological pattern recognition • chemometrics and evolutionary data • chemometrics and pharmaceutical sciences • chemometrics applications • chemometrics calibration • chemometrics classification • Chemometrics: Data Analysis for the Laboratory and Chemical Plant • chemometrics experimental design • chemometrics pattern recognition • chemometrics research • chemometrics signal processing • chemometrics workbook • classification • cluster analysis • Data Analysis • evolutionary signal • Experimental Design • Factor Analysis • guide to chemometrics • iterative target transformation factor analysis (ITTFA) • Life Sciences • MATLAB • Metabolomics • metabolomics-type examples of chemometrics • multivariate analysis • Multivariate calibration • Multivariate Statistics • pattern recognition • Principal Component Analysis (PCA) • principles of chemometrics • signal deconvolution • signal resolution • Statistics • statistics and chemometrics |
| ISBN-10 | 1-118-90468-0 / 1118904680 |
| ISBN-13 | 978-1-118-90468-8 / 9781118904688 |
Copy protection: Adobe DRM
Adobe DRM is a copy-protection scheme intended to protect the eBook from misuse. The eBook is authorised to your personal Adobe ID at download time. You can then read the eBook only on devices that are also registered to your Adobe ID.
File format: EPUB (Electronic Publication)
EPUB is an open standard for eBooks and is particularly well suited to fiction and non-fiction. The body text adapts dynamically to the display and font size, so EPUB is also well suited to mobile reading devices.
System requirements:
PC/Mac: You can read this eBook on a PC or Mac.
eReader: This eBook can be read on (almost) all eBook readers. It is, however, not compatible with the Amazon Kindle.
Smartphone/tablet: Whether Apple or Android, you can read this eBook.
Buying eBooks from abroad
For tax law reasons we can sell eBooks only within Germany and Switzerland. Regrettably, we cannot fulfil eBook orders from other countries.