
Nonlinear Parameter Optimization Using R Tools (eBook)

John C. Nash (Author)

eBook Download: PDF | EPUB
2014
John Wiley & Sons (publisher)
978-1-118-88396-9 (ISBN)

€60.99 incl. VAT (CHF 59.55)
eBook sales are handled by Lehmanns Media GmbH (Berlin) at the euro price incl. VAT.
  • Download available immediately

The aim of this book is to provide an appreciation of the R tools available for optimization problems. Most users of R are not specialists in computation, and the workings of the specialized tools are a black box to them. This can lead to mis-application, so users need help in making appropriate choices. This book looks at the principal tools available to users of the R statistical computing system for function minimization, optimization, and nonlinear parameter determination, featuring numerous examples throughout.

 


Nonlinear Parameter Optimization Using R
John C. Nash, Telfer School of Management, University of Ottawa, Canada

A systematic and comprehensive treatment of optimization software using R

In recent decades, optimization techniques have been streamlined by computational and artificial intelligence methods to analyze more variables, especially under nonlinear, multivariable conditions, more quickly than ever before. Optimization is an important tool for decision science and for the analysis of physical systems used in engineering. Nonlinear Parameter Optimization Using R explores the principal tools available in R for function minimization, optimization, and nonlinear parameter determination and features numerous examples throughout.

Nonlinear Parameter Optimization Using R:
  • Provides a comprehensive treatment of optimization techniques
  • Examines optimization problems that arise in statistics and how to solve them using R
  • Enables researchers and practitioners to solve parameter determination problems
  • Presents traditional methods as well as recent developments in R
  • Is supported by an accompanying website featuring R code, examples and datasets

Researchers and practitioners who need to solve parameter determination problems, and who use R but are novices in the field of optimization or function minimization, will benefit from this book. It will also be useful for scientists building and estimating nonlinear models in fields such as hydrology, sports forecasting, ecology, chemical engineering, pharmacokinetics, agriculture, economics and statistics.

JOHN C. NASH, Telfer School of Management, University of Ottawa, Canada

Cover 1
Title Page 5
Copyright 6
Contents 9
Preface 17
Chapter 1 Optimization problem tasks and how they arise 19
1.1 The general optimization problem 19
1.2 Why the general problem is generally uninteresting 20
1.3 (Non-)Linearity 22
1.4 Objective function properties 22
1.4.1 Sums of squares 22
1.4.2 Minimax approximation 23
1.4.3 Problems with multiple minima 23
1.4.4 Objectives that can only be imprecisely computed 23
1.5 Constraint types 23
1.6 Solving sets of equations 24
1.7 Conditions for optimality 25
1.8 Other classifications 25
References 26
Chapter 2 Optimization algorithms-an overview 27
2.1 Methods that use the gradient 27
2.2 Newton-like methods 30
2.3 The promise of Newton's method 31
2.4 Caution: convergence versus termination 32
2.5 Difficulties with Newton's method 32
2.6 Least squares: Gauss-Newton methods 33
2.7 Quasi-Newton or variable metric method 35
2.8 Conjugate gradient and related methods 36
2.9 Other gradient methods 37
2.10 Derivative-free methods 37
2.10.1 Numerical approximation of gradients 37
2.10.2 Approximate and descend 37
2.10.3 Heuristic search 38
2.11 Stochastic methods 38
2.12 Constraint-based methods-mathematical programming 39
References 40
Chapter 3 Software structure and interfaces 43
3.1 Perspective 43
3.2 Issues of choice 44
3.3 Software issues 45
3.4 Specifying the objective and constraints to the optimizer 46
3.5 Communicating exogenous data to problem definition functions 46
3.5.1 Use of "global" data and variables 49
3.6 Masked (temporarily fixed) optimization parameters 50
3.7 Dealing with inadmissible results 51
3.8 Providing derivatives for functions 52
3.9 Derivative approximations when there are constraints 54
3.10 Scaling of parameters and function 54
3.11 Normal ending of computations 54
3.12 Termination tests-abnormal ending 55
3.13 Output to monitor progress of calculations 55
3.14 Output of the optimization results 56
3.15 Controls for the optimizer 56
3.16 Default control settings 57
3.17 Measuring performance 57
3.18 The optimization interface 57
References 58
Chapter 4 One-parameter root-finding problems 59
4.1 Roots 59
4.2 Equations in one variable 60
4.3 Some examples 60
4.3.1 Exponentially speaking 60
4.3.2 A normal concern 62
4.3.3 Little Polly Nomial 64
4.3.4 A hypothequial question 67
4.4 Approaches to solving 1D root-finding problems 69
4.5 What can go wrong? 70
4.6 Being a smart user of root-finding programs 72
4.7 Conclusions and extensions 72
References 73
Chapter 5 One-parameter minimization problems 74
5.1 The optimize() function 74
5.2 Using a root-finder 75
5.3 But where is the minimum? 76
5.4 Ideas for 1D minimizers 77
5.5 The line-search subproblem 79
References 80
Chapter 6 Nonlinear least squares 81
6.1 nls() from package stats 81
6.1.1 A simple example 81
6.1.2 Regression versus least squares 83
6.2 A more difficult case 83
6.3 The structure of the nls() solution 90
6.4 Concerns with nls() 91
6.4.1 Small residuals 92
6.4.2 Robustness-"singular gradient" woes 93
6.4.3 Bounds with nls() 95
6.5 Some ancillary tools for nonlinear least squares 97
6.5.1 Starting values and self-starting problems 97
6.5.2 Converting model expressions to sum-of-squares functions 98
6.5.3 Help for nonlinear regression 98
6.6 Minimizing R functions that compute sums of squares 99
6.7 Choosing an approach 100
6.8 Separable sums of squares problems 104
6.9 Strategies for nonlinear least squares 111
References 111
Chapter 7 Nonlinear equations 113
7.1 Packages and methods for nonlinear equations 113
7.1.1 BB 114
7.1.2 nleqslv 114
7.1.3 Using nonlinear least squares 114
7.1.4 Using function minimization methods 114
7.2 A simple example to compare approaches 115
7.3 A statistical example 121
References 124
Chapter 8 Function minimization tools in the base R system 126
8.1 optim() 126
8.2 nlm() 128
8.3 nlminb() 129
8.4 Using the base optimization tools 130
References 132
Chapter 9 Add-in function minimization packages for R 133
9.1 Package optimx 133
9.1.1 Optimizers in optimx 134
9.1.2 Example use of optimx() 135
9.2 Some other function minimization packages 136
9.2.1 nloptr and nloptwrap 136
9.2.2 trust and trustOptim 137
9.3 Should we replace optim() routines? 139
References 140
Chapter 10 Calculating and using derivatives 141
10.1 Why and how 141
10.2 Analytic derivatives-by hand 142
10.3 Analytic derivatives-tools 143
10.4 Examples of use of R tools for differentiation 143
10.5 Simple numerical derivatives 145
10.6 Improved numerical derivative approximations 146
10.6.1 The Richardson extrapolation 146
10.6.2 Complex-step derivative approximations 146
10.7 Strategy and tactics for derivatives 147
References 149
Chapter 11 Bounds constraints 150
11.1 Single bound: use of a logarithmic transformation 150
11.2 Interval bounds: Use of a hyperbolic transformation 151
11.2.1 Example of the tanh transformation 152
11.2.2 A fly in the ointment 152
11.3 Setting the objective large when bounds are violated 153
11.4 An active set approach 154
11.5 Checking bounds 156
11.6 The importance of using bounds intelligently 156
11.6.1 Difficulties in applying bounds constraints 157
11.7 Post-solution information for bounded problems 157
Appendix 11.A Function transfinite 159
References 160
Chapter 12 Using masks 161
12.1 An example 161
12.2 Specifying the objective 161
12.3 Masks for nonlinear least squares 165
12.4 Other approaches to masks 166
References 166
Chapter 13 Handling general constraints 167
13.1 Equality constraints 167
13.1.1 Parameter elimination 169
13.1.2 Which parameter to eliminate? 171
13.1.3 Scaling and centering? 172
13.1.4 Nonlinear programming packages 172
13.1.5 Sequential application of an increasing penalty 174
13.2 Sumscale problems 176
13.2.1 Using a projection 180
13.3 Inequality constraints 181
13.4 A perspective on penalty function ideas 185
13.5 Assessment 185
References 186
Chapter 14 Applications of mathematical programming 187
14.1 Statistical applications of math programming 187
14.2 R packages for math programming 188
14.3 Example problem: L1 regression 189
14.4 Example problem: minimax regression 195
14.5 Nonlinear quantile regression 197
14.6 Polynomial approximation 198
References 201
Chapter 15 Global optimization and stochastic methods 203
15.1 Panorama of methods 203
15.2 R packages for global and stochastic optimization 204
15.3 An example problem 205
15.3.1 Method SANN from optim() 205
15.3.2 Package GenSA 206
15.3.3 Packages DEoptim and RcppDE 207
15.3.4 Package smco 209
15.3.5 Package soma 210
15.3.6 Package Rmalschains 211
15.3.7 Package rgenoud 211
15.3.8 Package GA 212
15.3.9 Package gaoptim 213
15.4 Multiple starting values 214
References 220
Chapter 16 Scaling and reparameterization 221
16.1 Why scale or reparameterize? 221
16.2 Formalities of scaling and reparameterization 222
16.3 Hobbs' weed infestation example 223
16.4 The KKT conditions and scaling 228
16.5 Reparameterization of the weeds problem 232
16.6 Scale change across the parameter space 232
16.7 Robustness of methods to starting points 233
16.7.1 Robustness of optimization techniques 236
16.7.2 Robustness of nonlinear least squares methods 238
16.8 Strategies for scaling 240
References 241
Chapter 17 Finding the right solution 242
17.1 Particular requirements 242
17.1.1 A few integer parameters 243
17.2 Starting values for iterative methods 243
17.3 KKT conditions 244
17.3.1 Unconstrained problems 244
17.3.2 Constrained problems 245
17.4 Search tests 246
References 247
Chapter 18 Tuning and terminating methods 248
18.1 Timing and profiling 248
18.1.1 rbenchmark 249
18.1.2 microbenchmark 249
18.1.3 Calibrating our timings 250
18.2 Profiling 252
18.2.1 Trying possible improvements 253
18.3 More speedups of R computations 256
18.3.1 Byte-code compiled functions 256
18.3.2 Avoiding loops 256
18.3.3 Package upgrades - an example 257
18.3.4 Specializing codes 259
18.4 External language compiled functions 260
18.4.1 Building an R function using Fortran 262
18.4.2 Summary of Rayleigh quotient timings 264
18.5 Deciding when we are finished 265
18.5.1 Tests for things gone wrong 266
References 267
Chapter 19 Linking R to external optimization tools 268
19.1 Mechanisms to link R to external software 269
19.1.1 R functions to call external (sub)programs 269
19.1.2 File and system call methods 269
19.1.3 Thin client methods 270
19.2 Prepackaged links to external optimization tools 270
19.2.1 NEOS 270
19.2.2 Automatic Differentiation Model Builder (ADMB) 270
19.2.3 NLopt 271
19.2.4 BUGS and related tools 271
19.3 Strategy for using external tools 271
References 272
Chapter 20 Differential equation models 273
20.1 The model 273
20.2 Background 274
20.3 The likelihood function 276
20.4 A first try at minimization 276
20.5 Attempts with optimx 277
20.6 Using nonlinear least squares 278
20.7 Commentary 279
Reference 280
Chapter 21 Miscellaneous nonlinear estimation tools for R 281
21.1 Maximum likelihood 281
21.2 Generalized nonlinear models 284
21.3 Systems of equations 286
21.4 Additional nonlinear least squares tools 286
21.5 Nonnegative least squares 288
21.6 Noisy objective functions 291
21.7 Moving forward 292
References 293
Index 297

"The book chapters are enriched by little anecdotes, and the reader obviously benefits from John C. Nash's experience of more than 30 years in the field of nonlinear optimization. This experience translates into many practical recommendations and tweaks. The book provides plenty of code examples and useful code snippets." (Biometrical Journal, 2016)

Chapter 1
Optimization problem tasks and how they arise


In this introductory chapter we look at the classes of problems for which we will discuss solution tools. We also consider the interrelationships between different problem classes as well as among the solution methods. The treatment is quite general, and R is only incidental to this chapter except for some examples. In effect, we write out here the list of tasks the rest of the book addresses.

1.1 The general optimization problem


The general constrained optimization problem can be stated as follows.

    find x* = argmin f(x)    such that    c(x) >= 0

Note that f is a scalar function but x is a vector. There may or may not be constraints on the values of x, and these are expressed formally in the vector of functions c. While these functions are general, many problems have much simpler constraints, such as requirements that the values of x be no less than some lower bounds or no greater than some upper bounds, as we shall discuss in the following text.
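To make this concrete, here is a minimal sketch in base R (the objective function and bounds are invented for illustration): constrOptim() handles the special case where the constraints c(x) >= 0 are linear in x, written as ui %*% x - ci >= 0.

# minimize f(x) = (x1 - 1)^2 + (x2 - 2)^2 subject to x1 >= 0 and x2 >= 0
fobj <- function(x) (x[1] - 1)^2 + (x[2] - 2)^2
ui <- diag(2)  # ui %*% x - ci >= 0 encodes the two nonnegativity constraints
ci <- c(0, 0)
res <- constrOptim(c(0.5, 0.5), fobj, grad = NULL, ui = ui, ci = ci)
res$par  # approximately (1, 2): the unconstrained minimum already satisfies the bounds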

We have specified the problem as a minimization, but maximization problems can be transformed to minimizations by multiplying the objective function by -1.
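As a one-line illustration (the quadratic here is invented): to maximize f we simply minimize -f.

f <- function(x) 10 - (x - 3)^2  # invented function with its maximum at x = 3
optimize(function(x) -f(x), interval = c(0, 10))$minimum  # about 3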

Note also that we have asked for the set of arguments x that minimize the objective, which essentially implies the global minimum. However, many—if not most—of the numerical methods in optimization are able to find only local minima and quite a few problems are such that there may be many local minima and possibly even more than one global minimum. That is, the global minimum may occur at more than one set of parameters x and may occur on a line or surface.
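A small invented example of this pitfall: the function below has two global minima, and optim() simply returns whichever one is nearer the starting point.

f2 <- function(x) (x[1]^2 - 1)^2 + (x[2] - 0.5)^2  # global minima at (1, 0.5) and (-1, 0.5)
optim(c( 2, 0), f2)$par  # converges near ( 1, 0.5)
optim(c(-2, 0), f2)$par  # converges near (-1, 0.5)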

1.2 Why the general problem is generally uninteresting


While there do exist methods for tackling the general optimization problem, almost all the "real" work of optimization in problems related to statistics and modeling tends to be done by more specialized methods that work on problems restricted in some way by the nature of the objective or the constraints (or lack thereof). Indeed, for a number of particular problems, there are very specialized packages expressly designed to solve them. Unfortunately, the user often has to work quite hard to decide whether his or her problem actually matches the design considerations of the specialized package. Seemingly small changes, for example, a condition that parameters must be positive, can render the specialized package useless. On the other hand, a very general tool may be quite tedious for the user to apply, because objective functions and constraints may require a very large amount of program code in some cases.

In the real world, the objective function and the constraints are not only functions of the parameters x but also depend on data; in fact, they may depend on vast arrays of data, particularly in statistical problems involving large systems.

To illustrate, consider the following examples, which, while "small," raise some of the issues we will encounter.

Cobb–Douglas example


The Cobb–Douglas production function (Nash and Walker-Smith, 1987, p. 375) predicts the quantity of production q of a commodity as a function of the inputs of capital K (it appears traditional to use a K for this variable) and labour L used, namely,

    q = b1 * K^b2 * L^b3     (1.1)

A traditional approach to this problem is to take logarithms to get

    log(q) = log(b1) + b2 * log(K) + b3 * log(L)     (1.2)

However, the two forms imply very different ways in which errors are assumed to exist between the model and real-world data. Let us assume (almost certainly dangerously) that the data for K and L are known precisely, but there may be errors in the data for q; call the observed values q_data. In particular, if we use additive errors in the log form,

    log(q_data) = log(b1) + b2 * log(K) + b3 * log(L) + e_log     (1.3)

then we have

    log(q_data) = log(q) + e_log     (1.4)

where we have given these errors a particular name, e_log. This means that the errors are actually multiplicative in the real scale of the data:

    q_data = q * exp(e_log)     (1.5)

If we estimate the model using the log form, we can sometimes get quite different estimates of the parameters than when using the direct form. The "errors" have different weights in the different scales, and this alters the estimates. If we really believe that the errors are distributed around the direct model with constant variance, then we should not be using the log form, because the log form implies that the relative errors are distributed with constant variance.
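A hedged numerical sketch of this point (the data, parameter values, and starting values are all invented): simulate data with multiplicative (log-normal) errors, then compare the log-form fit from lm() with a direct nonlinear least squares fit from nls().

set.seed(42)
K <- runif(50, 1, 10)
L <- runif(50, 1, 10)
q <- 2 * K^0.6 * L^0.3                  # "true" model with b = (2, 0.6, 0.3)
q_data <- q * exp(rnorm(50, sd = 0.2))  # multiplicative errors, additive on the log scale
# log form: linear in log(b1), b2, and b3, so ordinary linear regression applies
fit_log <- lm(log(q_data) ~ log(K) + log(L))
c(b1 = exp(coef(fit_log)[[1]]), coef(fit_log)[-1])
# direct form: nonlinear least squares, which assumes additive errors in the real scale
fit_direct <- nls(q_data ~ b1 * K^b2 * L^b3, start = list(b1 = 1, b2 = 0.5, b3 = 0.5))
coef(fit_direct)  # typically differs somewhat from the log-form estimates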

Hobbs' weed infestation example


This problem is also a nonlinear least squares problem. As we shall see later, it demonstrates a number of computational issues. The problem came across my desk sometime in 1974 when I was working on the development of a program to solve nonlinear least squares estimation problems. I had written several variants of Gauss–Newton methods in BASIC for a Data General NOVA system. This early minicomputer offered a very limited environment: a 10-character-per-second teletype with paper tape reader and punch that allowed access to a maximum 8K byte (actually 4K word) segment of the machine. Arithmetic was particularly horrible in that floating point used six hexadecimal digits in the mantissa with no guard digit.

The problem was supplied by Mr. Dave Hobbs of Agriculture Canada. As I was told, the observations (y) are weed densities per unit area over 12 growing periods. I was never given the actual units of the observations. Here are the data (Figure 1.1).

# draw the data
y <- c(5.308, 7.24, 9.638, 12.866, 17.069, 23.192,
       31.443, 38.558, 50.156, 62.948, 75.995, 91.972)
t <- 1:12
plot(t, y)
title(main = "Hobbs' weed infestation data", font.main = 4)

Figure 1.1 Hobbs' weed infestation data.

It was suggested that the appropriate model was a 3-parameter logistic, that is,

    y = b1 / (1 + b2 * exp(-b3 * t))     (1.6)

where y is the weed density, t is the growing period, and b1, b2, and b3 are the parameters to be determined. We shall see later that there are other forms for the model that may give better computational properties.
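As a minimal sketch (the starting values are invented and deliberately chosen near the known solution): the model can be fitted to the data above with nls(). Chapter 6 shows that, from poorer starting values, this problem can defeat nls() with "singular gradient" failures.

hobbs <- data.frame(t = 1:12,
                    y = c(5.308, 7.24, 9.638, 12.866, 17.069, 23.192,
                          31.443, 38.558, 50.156, 62.948, 75.995, 91.972))
fit <- nls(y ~ b1 / (1 + b2 * exp(-b3 * t)), data = hobbs,
           start = list(b1 = 200, b2 = 50, b3 = 0.3))
coef(fit)  # lands near b1 = 196.2, b2 = 49.1, b3 = 0.314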

1.3 (Non-)Linearity


What do we mean by “nonlinear?” The word clearly implies “not a straight line,” and many researchers take this to apply to the model they are trying to estimate. However, for the process of estimation, which generally involves minimizing a loss function such as a sum of squared deviations or maximizing a likelihood function, the key issue is that of solving a set of equations to find the result.

When we minimize the sum of squares for a model that is linear in the parameters, such as the log form of the Cobb–Douglas function (1.2) above, where log(b1), b2, and b3 appear only to the first power, we can apply standard calculus to arrive at the normal equations. These are a set of linear equations. However, when we want to minimize the sum of squares from the original model (1.1), it is generally necessary to use an iterative method from some starting set of the parameters b1, b2, and b3.

For the purposes of this book, "nonlinear" will refer to the process of finding a solution, implying that there is no method that arrives at one via a predetermined set of solutions of linear equations. That is, while we use a lot of linear algebra in finding solutions to the problems of interest in this book, we cannot specify in advance how many such linear subproblems are needed.

1.4 Objective function properties


There are some particular forms of the objective function that lead to specialized, but quite common, solution methods. This gives us one dimension or axis by which to categorize the optimization methods we shall consider later.

1.4.1 Sums of squares


If the objective function is a sum of squared terms, we can use a method for solving nonlinear least squares problems. Clearly, the estimation of the Cobb–Douglas production model above by minimizing the sum of squared residuals is a problem of this type.

We note that the Cobb–Douglas problem is linear in the parameters in the case of the log-form model. The linear least squares problem is so pervasive that it is worth noting how it may be solved because some approaches to nonlinear problems can be viewed as solving sequences of linear problems.
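For reference, a minimal sketch (with invented data) of how a linear least squares subproblem can be solved directly; this is the kind of step that approaches such as Gauss–Newton repeat on a sequence of linearized problems.

set.seed(1)
X <- cbind(1, rnorm(20), rnorm(20))  # design matrix with an intercept column
beta_true <- c(1, 2, -0.5)
y <- drop(X %*% beta_true) + rnorm(20, sd = 0.1)
qr.solve(X, y)  # least squares solution via the QR decomposition; no iteration needed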

1.4.2 Minimax approximation


It is sometimes important to have an upper bound on the deviation of a model from "data." We therefore wish to find the set of parameters in a model that minimizes the maximum deviation, hence a minimax problem. In particular, consider that there may be relatively simple approximations to some specialized and awkward-to-compute special functions. This sort of approximation problem is less familiar to statistical workers than sums-of-squares problems. Moreover, the small residuals may render some traditional methods, such as the R function nls(), ill-suited to their solution.
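A hedged sketch (the grid, target function, and starting values are invented): approximate exp(x) on [0, 1] by a straight line minimizing the maximum absolute deviation over a grid. The max makes the objective nonsmooth, so the derivative-free Nelder-Mead method in optim() is used rather than a gradient method.

xg <- seq(0, 1, length.out = 101)
maxdev <- function(p) max(abs(exp(xg) - (p[1] + p[2] * xg)))  # worst-case deviation of the line
res <- optim(c(1, 1.7), maxdev)  # default Nelder-Mead tolerates the nonsmooth objective
res$par    # line coefficients
res$value  # the minimized maximum deviation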

1.4.3 Problems with multiple minima ...


Publication date (per publisher): 3 April 2014
Language: English
Subject areas: Mathematics / Computer Science > Mathematics > Applied Mathematics
Mathematics / Computer Science > Mathematics > Computer Programs / Computer Algebra
Technology
Keywords: algorithms • Computational & Graphical Statistics • conjugate • difficulties • generally • general problem • Gradient • Least Squares • Mathematics • Methods • newtons • Optimization • optimization problem • Overview • Promise • Properties • Statistical Software / R • Statistics • Tasks • termination • uninteresting
ISBN-10 1-118-88396-9 / 1118883969
ISBN-13 978-1-118-88396-9 / 9781118883969
PDF (Adobe DRM)
Size: 3.6 MB

Copy protection: Adobe DRM
Adobe DRM is a copy-protection scheme intended to protect the eBook against misuse. On download, the eBook is authorized to your personal Adobe ID; it can then be read only on devices registered to that Adobe ID.

File format: PDF (Portable Document Format)
With its fixed page layout, PDF is particularly suited to technical books with columns, tables, and figures. A PDF can be displayed on almost any device but is only of limited use on small displays (smartphone, eReader).

System requirements:
PC/Mac: You can read this eBook on a PC or Mac. You need an Adobe ID and the free Adobe Digital Editions software. We advise against using the OverDrive Media Console, as it frequently causes problems with Adobe DRM.
eReader: This eBook can be read on (almost) all eBook readers, but it is not compatible with the Amazon Kindle.
Smartphone/Tablet: Whether Apple or Android, you can read this eBook. You need an Adobe ID and a free app.

Additional feature: online reading
In addition to downloading it, you can also read this eBook online in your web browser.

Buying eBooks from abroad
For tax law reasons we can sell eBooks just within Germany and Switzerland. Regrettably we cannot fulfill eBook-orders from other countries.

EPUB (Adobe DRM)

File format: EPUB (Electronic Publication)
EPUB is an open standard for eBooks and is particularly suited to fiction and general non-fiction. The text reflows dynamically to fit the display and font size, so EPUB is also well suited to mobile reading devices.

Copy protection, system requirements, and the restriction of sales to Germany and Switzerland are the same as for the PDF edition above.
