Hostname: page-component-cd9895bd7-hc48f Total loading time: 0 Render date: 2024-12-28T11:58:01.315Z Has data issue: false hasContentIssue false

Theory of algorithms for unconstrained optimization

Published online by Cambridge University Press:  07 November 2008

Jorge Nocedal
Affiliation:
Department of Electrical Engineering and Computer ScienceNorthwestern University, Evanston, IL 60208USA, E-mail [email protected]

Extract

A few months ago, while preparing a lecture to an audience that included engineers and numerical analysts, I asked myself the question: from the point of view of a user of nonlinear optimization routines, how interesting and practical is the body of theoretical analysis developed in this field? To make the question a bit more precise, I decided to select the best optimization methods known to date – those methods that deserve to be in a subroutine library – and for each method ask: what do we know about the behaviour of this method, as implemented in practice? To make my task more tractable, I decided to consider only algorithms for unconstrained optimization.

Type
Research Article
Copyright
Copyright © Cambridge University Press 1992

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

References

REFERENCES

Akaike, H. (1959), ‘On a successive transformation of probability distribution and its application to the analysis of the optimum gradient method’, Ann. Inst. Statist. Math. 11, 117.CrossRefGoogle Scholar
Al-Baali, M. (1985), ‘Descent property and global convergence of the Fletcher–Reeves method with inexact line search’, IMA J. Numer. Anal. 5, 121124.CrossRefGoogle Scholar
Al-Baali, M. (1990), ‘Variational Quasi-Newton Methods for Unconstrained Optimization’, Technical Report, Department of Mathematics, University of Damascus (Syria).Google Scholar
Bank, R.E. and Rose, D.J. (1981) ‘Global approximate Newton methods’, Numer. Math. 37, 279295.CrossRefGoogle Scholar
Baptist, P. and Stoer, J. (1977), ‘On the relation between quadratic termination and convergence properties of minimization algorithms, Part II, Applications’, Numer. Math. 28, 367392.CrossRefGoogle Scholar
Beale, E.M.L. (1972), ‘A derivation of conjugate gradients’, in Numerical Methods for Nonlinear Optimization (Lootsma, F.A., ed.), Academic Press (New York).Google Scholar
Brown, P.N. and Saad, Y. (1989), ‘Globally convergent techniques in nonlinear Newton–Krylov algorithms’, Technical Report UCRL-102434, Lawrence Livermore National Laboratory.Google Scholar
Brown, P.N. and Saad, Y. (1990), ‘Hybrid Krylov methods for nonlinear systems of equations’, SIAM J. Sci. Stat. Comput. 11, 450481.CrossRefGoogle Scholar
Broyden, G.G., Dennis, J.E. and Moré, J.J. (1973), ‘On the local and superlinear convergence of quasi-Newton methods’, J. Inst. Math. Applics 12, 223246.CrossRefGoogle Scholar
Buckley, A. and LeNir, A. (1983), ‘QN-like variable storage conjugate gradients’, Math. Program. 27, 155175.CrossRefGoogle Scholar
Burmeister, W. (1973), ‘Die Konvergenzordnung des Fletcher–Powell Algorithmus’, Z. Angew. Math. Mech. 53, 693699.CrossRefGoogle Scholar
Byrd, R.H. and Nocedal, J. (1989), ‘A tool for the analysis of quasi-Newton methods with application to unconstrained minimization’, SIAM J. Numer. Anal. 26, 727739.CrossRefGoogle Scholar
Byrd, R., Liu, D. and Nocedal, J. (1990), ‘On the Behavior of Broyden's Class of Quasi-Newton Methods’, Technical Report NAM 01, Northwestern University, EECS Dept.Google Scholar
Byrd, R.H., Nocedal, J. and Yuan, Y. (1987), ‘Global convergence of a class of quasi-Newton methods on convex problems’, SIAM J. Numer. Anal. 24, 11711190.CrossRefGoogle Scholar
Byrd, R.H., Tapia, R.A. and Zhang, Y. (1990), ‘An SQP Augmented Lagrangian Bfgs Algorithm for Constrained Optimization’, Technical Report, University of Colorado (Boulder).Google Scholar
Cohen, A. (1972), ‘Rate of convergence of several conjugate gradient algorithms’, SIAM J. Numer. Anal. 9, 248259.CrossRefGoogle Scholar
Conn, A.R., Gould, N.I.M., and Toint, Ph.L. (1988a), ‘Global convergence of a class of trust region algorithms for optmization with simple bounds’, SIAM J. Numer. Anal. 25, 433460.CrossRefGoogle Scholar
Conn, A.R., Gould, N.I.M., and Toint, Ph.L. (1988b), ‘Testing a class of methods for solving minimization problems with simple bounds on the variables’, Math. Comput. 50, 399430.CrossRefGoogle Scholar
Conn, A.R., Gould, N.I.M., and Toint, Ph.L. (1991), ‘Convergence of quasi-Newton matrices generated by the symmetric rank one update’, Math. Prog. 2, 177195.CrossRefGoogle Scholar
Crowder, H.P. and Wolfe, P. (1972), ‘Linear convergence of the conjugate gradient method’, IBM J. Res. Dev. 16, 431433.CrossRefGoogle Scholar
Davidon, W.C. (1959), ‘Variable metric methods for minimization’, Argonne National Lab Report (Argonne, IL).Google Scholar
Davidon, W.C. (1975), ‘Optimally conditioned optimization algorithms without line searches’, Math. Prog. 9, 130.CrossRefGoogle Scholar
Dembo, R.S., Eisenstat, S.C., and Steihaug, T. (1982), ‘Inexact Newton methods’, SIAM J. Numer. Anal. 19, 400408.CrossRefGoogle Scholar
Dennis, J.E. Jr and Moré, J.J. (1974), ‘A characterization of superlinear convergence and its application to quasi-Newton methods’, Math. Comp. 28, 549560.CrossRefGoogle Scholar
Dennis, J.E. Jr and Moré, J.J. (1977), ‘Quasi-Newton methods, motivation and theory’, SIAM Review 19, 4689.CrossRefGoogle Scholar
Dennis, J.E. Jr and Schnabel, R.B. (1983), Numerical Methods for Unconstrained Optimization and Nonlinear Equations, Prentice-Hall (Englewood Cliffs, NJ).Google Scholar
Dennis, J.E. Jr and Schnabel, R.B. (1987), ‘A view of unconstrained optimization’, Technical Report CU-CS-376-87, University of Colorado (Boulder), to appear in: Handbooks in Operations Research and Management Science vol. 1: Optimization (G.L. Nemhauser, A.H.G. Rinnooy Kan, and M.J. Todd, eds), North-Holland (Amsterdam).Google Scholar
Dennis, J.E. Jr and Walker, H.F. (1981), ‘Convergence theorems for least-change secant update methods’, SIAM J. Numer. Anal. 18, 949987; 19, 443.CrossRefGoogle Scholar
Dennis, J.E. Jr and Wolkowicz, H. (1991), ‘Sizing and least change secant methods’, Technical Report CORR 90-02, Department of Combinatorics and Optimization, University of Waterloo, Waterloo, Canada.Google Scholar
Dixon, L.C.W. (1972), ‘The choice of step length, a crucial factor in the performance of variable metric algorithms’, in Numerical Methods for Nonlinear Optimization (Lootsma, F.A., ed.), Academic Press (London).Google Scholar
Eisenstat, S. and Walker, H. (1990), ‘Globally convergent inexact Newton methods’, Yale Technical Report.Google Scholar
El Hallabi, M. and Tapia, R.A. (1989), ‘A global convergence theory for arbitrary norm trust-region methods for nonlinear equations’, Technical Report TR87-25, Rice University, Department of Mathematical Sciences.Google Scholar
Fiacco, A.V. and McCormick, G.P. (1968), Nonlinear Programmming, John Wiley & Sons (New York).Google Scholar
Fletcher, R. (1987), Practical Methods of Optimization vol. 1: Unconstrained Optimization John Wiley & Sons (New York).Google Scholar
Fletcher, R. (1991), ‘A new variational result for quasi-Newton formulae’, SIAM J. Optimization 1(1), 1821.CrossRefGoogle Scholar
Fletcher, R. and Powell, M.J.D. (1963), ‘A rapidly convergent descent method for minimization’, Comput. J. 6, 163168.CrossRefGoogle Scholar
Fletcher, R. and Reeves, C. (1964), ‘Function minimization by conjugate gradients’, Comput. J. 7, 149154.CrossRefGoogle Scholar
Gay, D. (1983), ‘Subroutines for unconstrained minimization using a model/ trust-region approach’, ACM Trans. Math. Soft. 9(4), 503524.CrossRefGoogle Scholar
Gilbert, J.C. and Lemaréchal, C. (1989), ‘Some numerical experiments with variable storage quasi-Newton algorithms’, Math. Program. 45, 407436.CrossRefGoogle Scholar
Gilbert, J.C. and Nocedal, J. (1990), ‘Global convergence properties of conjugate gradient methods for optimization’, Rapport de Recherche, INRIA (Paris).Google Scholar
Gill, P.E. and Murray, W. (1979), ‘Conjugate-gradient methods for large-scale nonlinear optimization’, Technical report SOL 79-15, Dept. of Operations Research, Stanford University.CrossRefGoogle Scholar
Gill, P. E., Murray, W. and Wright, M. H. (1981), Practiced Optimization, Academic Press (London).Google Scholar
Goldfarb, D., and Idnani, A. (1983), ‘A numerically stable dual method for solving strictly convex quadric programs’, Math. Prog. 27, 133.CrossRefGoogle Scholar
Griewank, A. (1991), ‘The global convergence of partitioned BFGS on problems with convex decompositions and Lipschitzian gradients’, Math. Prog. 50, 141175.CrossRefGoogle Scholar
Griewank, A. and Toint, Ph.L. (1982a), ‘On the unconstrained optimization of partially separable objective functions’, in Nonlinear Optimization 1981 (Powell, M.J.D., ed.), Academic Press (London), 301312.Google Scholar
Griewank, A. and Toint, Ph.L. (1982b), ‘Local convergence analysis of partitioned quasi-Newton updates’, Numer. Math. 39, 429448.CrossRefGoogle Scholar
Griewank, A. and Toint, Ph.L. (1982c), ‘Partitioned variable metric updates for large structured optimization problems’, Numer. Math. 39, 119137.CrossRefGoogle Scholar
Griewank, A. and Toint, Ph.L. (1984), ‘Numerical experiments with partially separable optimization problems’, in Numerical Analysis: Proceedings Dundee 1983 (Lecture Notes in Mathematics 1066) (Griffiths, D.F., ed.), Springer Verlag, (Berlin), 203220.CrossRefGoogle Scholar
Grippo, L., Lampariello, F. and Lucidi, S. (1990a), ‘A quasi-discrete Newton algorithm with a nonmonote stabilization technique’, J. Optim. Theory Appl. 64, 485500.CrossRefGoogle Scholar
Grippo, L., Lampariello, F. and Lucidi, S. (1990b), ‘A class of nonmonotone stabilization methods in unconstrained optimization’, Technical Report R-290, Consiglio Nazionale delle Ricerche.Google Scholar
Güler, O. (1989), ‘Optimal algorithms for smooth convex programming’, Working Paper Series No. 89-17, Department of Management Sciences, The University of Iowa.Google Scholar
Hamming, R.W. (1971), Introduction to Applied Numerical Analysis, McGraw-Hill (New York).Google Scholar
Hu, Y.F. and Storey, C. (1991), ‘On optimally and near-optimally conditioned quasi-Newton updates’, Technical Report A141, Department of Mathematical Sciences, Loughborough University of Technology (Leicestershire).Google Scholar
Khalfan, H., Byrd, R.H., and Schnabel, R.B. (1990), ‘A theoretical and experimental study of the symmetric rank one update’, Technical Report CU-CS-489-90, University of Colorado (Boulder).CrossRefGoogle Scholar
Lalee, M. and Nocedal, J. (1991), ‘Automatic column scaling strategies for quasi-Newton methods’, Report NAM 04, EECS Department, Northwestern University (Evanston, IL).Google Scholar
Lemaréchal, C. (1981), ‘A view of line searches’, in Optimization and Optimal Control (Lecture Notes in Control and Information Science 30) (Auslander, A., Oettli, W. and Stoer, J., eds) Springer Verlag (Berlin) 5978.CrossRefGoogle Scholar
Liu, D.C. and Nocedal, J. (1989), ‘On the limited memory BFGS method for large scale optimization’, Math. Program. 45, 503528.CrossRefGoogle Scholar
Luenberger, D.G. (1984), Linear and Nonlinear Programming, 2nd edition, Addison-Wesley (Reading, MA).Google Scholar
Luksšan, L. (1991a), ‘Computational experience with improved conjugate gradient methods for unconstrained minimization’, Technical Report 488, Institute of Computer and Information Sciences, Czechoslovak Academy of Sciences (Prague).Google Scholar
Luksšan, L. (1991b), ‘On variationally derived scaling and preconvex variable metric updates’, Technical Report 496, Institute of Computer and Information Sciences, Czechoslovak Academy of Sciences (Prague).Google Scholar
Martínez, J.M. (1990), ‘Local convergence theory of inexact Newton methods based on structured least change updates’, Math. Comp. 55, 143168.CrossRefGoogle Scholar
Martínez, J.M. (1991), ‘On the relation between two local convergence theories of least change update methods’, Technical Report IMECC-UNICAMP.CrossRefGoogle Scholar
Moré, J. (1983), ‘Recent developments in algorithms and software for trust region methods’, in Mathematical Programming, The State of the Art (Bachem, A., Grotschel, M. and Korte, G., eds), Springer-Verlag (Berlin), 256287.Google Scholar
Moré, J. and Sorensen, D.C. (1984), ‘Newton's method’, in Studies in Numerical Analysis (Golub, G.H., ed.), The Mathematical Association of America (Providence RI), 2982.Google Scholar
Moré, J., and Thuente, D.J. (1990) ‘On line search algorithms with guaranteed sufficient decrease’, Mathematics and Computer Science Division Preprint MCS-P153-0590, Argonne National Laboratory (Argonne, IL).Google Scholar
Nash, S.G. (1985), ‘Preconditioning of truncated-Newton methods’, SIAM J. Sci. Stat. Comput, 6, 599616.CrossRefGoogle Scholar
Nazareth, J.L. and Mifflin, R.B. (1991), ‘The least prior deviation quasi-Newton update’, Technical Report, Department of Pure and Applied Mathematics, Washington State University.Google Scholar
Nemirovsky, A.S. and Yudin, D. B. (1983), Problem Complexity and Method Efficiency, Wiley (New York).Google Scholar
Nesterov, Y.E. (1983), ‘A method of solving a convex programming problem with convergence rate O(1/k 2)’, Sov, Math. Dokl. 27, 372376.Google Scholar
Nesterov, Y.E. (1988), ‘On an approach to the construction of optimal methods of minimization of smooth convex functions’, Ekonom. i Matem. 24, 509517.Google Scholar
Nocedal, J. (1980), ‘Updating quasi-Newton matrices with limited storage’, Math. Comput. 35, 773782.CrossRefGoogle Scholar
Nocedal, J. (1990), ‘The performance of several algorithms for large scal unconstrained optimization’, in Large-Scale Numerical Optimization (Coleman, T.F. and Li, Y., eds), SIAM (Philadelphia), 138151.Google Scholar
Nocedal, J. and Yuan, Y. (1991), ‘Analysis of a self-scaling quasi-Newton method’, Technical Report NAM-02, Department of Electrical Engineering and Computer Science, Northwestern University (Evanston, IL).Google Scholar
O'Leary, D.P. (1982), ‘A discrete Newton algorithm for minimizing a function of many variables’, Math. Prog. 23, 2033.CrossRefGoogle Scholar
Oren, S.S. (1982), ‘Perspectives on self-scaling variable metric algorithms’, J. Opt. Theory Appl. 37, 137147.CrossRefGoogle Scholar
Oren, S.S. and Luenberger, D.G. (1974), ‘Self-scaling variable metric(SSVM) Algoriths I: Criteria and sufficient conditions for scaling a class of algorithms’, Management Sci. 20, 845862.CrossRefGoogle Scholar
Ortega, J.M. and Rheinboldt, W.C. (1970), Iterative Solution of Nonlinear Equations in Several Variables, Academic Press (New York).Google Scholar
Osborne, M.R. and Sun, L.P. (1988), ‘A new approach to the symmetric rank-one updating algorithm’, Report NMO/01, Department of Statistics, IAS, Australian National University.Google Scholar
Ostrowski, A. (1966), Solution of Equations and Systems of Equations, second edition, Academic Press (New York).Google Scholar
Pearson, J.D. (1969), ‘Variable metric methods of minimization’, Comput. J. 12, 171178.CrossRefGoogle Scholar
Polak, E. and Ribière, G. (1969), ‘Note sur la convergence de methodes de directions conjugées’, Rev. Française Informat Recherche Operationelle, 3e Année 16, 3543.Google Scholar
Powell, M.J.D. (1971), ‘On the convergence of the variable metric algorithm’, J. Inst. Math. Applics 7, 2136.CrossRefGoogle Scholar
Powell, M.J.D. (1972), ‘Some properties of the variable metric method’, in Numerical Methods for Nonlinear Optimization (Lootsma, F.A., ed.), Academic Press (London).Google Scholar
Powell, M.J.D. (1976a), ‘Some global convergence properties of a variable metric algorithm for minimization without exact line searches’, in Nonlinear Programming, SIAM-AMS Proceedings, Vol. IX, (Cottle, R.W. and Lemke, C.E., eds), SIAM Publications (Philadelphia).Google Scholar
Powell, M.J.D. (1976b), ‘Some convergence properties of the conjugate gradient method’, Math. Program. 11, 4249.CrossRefGoogle Scholar
Powell, M.J.D. (1977), ‘Restart procedures of the conjugate gradient method’, Math. Program. 2, 241254J.CrossRefGoogle Scholar
Powell, M.J.D. (1983), ‘On the rate of convergence of variable metric algorithms for unconstrained optimization’, Report DAMTP 1983/NA7, Department of Applied Mathematics and Theoretical Physics, University of Cambridge (Cambridge).Google Scholar
Powell, M.J.D. (1984a), ‘Nonconvex minimization calculations and the conjugate gradient method’, in Lecture Notes in Mathematics vol. 1066, Springer-Verlag (Berlin), 122141.Google Scholar
Powell, M.J.D. (1984b), ‘On the global convergence of trust region algorithms for unconstrained minimization’, Math. Program. 29, 297303.CrossRefGoogle Scholar
Powell, M.J.D. (1985), ‘Convergence properties of algorithms for nonlinear optimization’, Report DAMTP 1985/NA1, Department of Applied Mathematics and Theoretical Physics, University of Cambridge (Cambridge).Google Scholar
Powell, M.J.D. (1986), ‘How bad are the BFGS and DFP methods when the objective function is quadratic?’, Math. Prog. 34, 3447.CrossRefGoogle Scholar
Powell, M.J.D. (1987), ‘Update conjugate directions by the BFGS formula’, Math. Prog. 38, 2946.CrossRefGoogle Scholar
Pu, D.G. and Yu, W.C. (1988), ‘On the convergence property of the DFP algorithm’, J. Qüfu Normol University 14(3), 6369.Google Scholar
Ritter, K. (1980), ‘On the rate of superlinear convergence of a class of variable metric methods’, Numer. Math. 35, 293313.CrossRefGoogle Scholar
Schnabel, R.B. (1978), ‘Optimal conditioning in the convex class of rank two updates’, Math. Prog. 15, 247260.CrossRefGoogle Scholar
Schnabel, R.B. (1989), ‘Sequential and parallel methods for unconstrained optimization’, in Mathematical Programming, Recent Developments and Applications (Iri, M. and Tanabe, K., eds), Kluwer (Deventer), 227261.Google Scholar
Schuller, G. (1974), ‘On the order of convergence of certain quasi-Newton methods’, Numer. Math. 23, 181192.CrossRefGoogle Scholar
Shanno, D.F. (1978a), ‘On the convergence of a new conjugate gradient algorithm’, SIAM J. Numer. Anal. 15, 12471257.CrossRefGoogle Scholar
Shanno, D.F. (1978b), ‘Conjugate gradient methods with inexact searches’, Math. Operations Res. 3, 244256.CrossRefGoogle Scholar
Shanno, D.F. and Phua, K.H. (1980), ‘Remark on algorithm 500: minimization of unconstrained multivariate functions’, ACM Trans, on Math. Software 6, 618622.CrossRefGoogle Scholar
Siegel, D. (1991), ‘Modifying the BFGS update by a new column scaling technique’, Technical Report DAMTP 1991/NA5, Department of Applied Mathematics and Theoretical Physics, University of Cambridge.Google Scholar
Steihaug, T. (1983), ‘The conjugate gradient method and trust regions in large scale optimization’, SIAM J. Numer. Anal. 20, 626637.CrossRefGoogle Scholar
Stoer, J. (1977), ‘On the relation between quadratic termination and convergence properties of minimization algorithms’, Numer. Math. 28, 343366.CrossRefGoogle Scholar
Toint, Ph.L. (1983), ‘VE08AD, a routine for partially separable optimization with bounded variables’, Harwell Subroutine Library, AERE (UK).Google Scholar
Toint, Ph.L. (1986a), ‘A view of nonlinear optimization in a large number of variables’, Technical Report 86/16, Facultés Universitaires de Namur.Google Scholar
Toint, Ph.L. (1986b), ‘Global convergence of the partioned BFGS algorithm for convex partially separable optimization’, Math. Prog. 36, 290306.CrossRefGoogle Scholar
Torczon, V. (1991), ‘On the convergence of the multidimensional search algorithm’, SIAM J. Optimization 1(1), 123145.CrossRefGoogle Scholar
Touati-Ahmed, D. and Storey, C. (1990), ‘Efficient hybrid conjugate gradient techniques’, J. Optimization Theory Appl. 64, 379397.CrossRefGoogle Scholar
Warth, W. and Werner, J. (1977), ‘Effiziente Schrittweitenfunktionen bei unrestringierten Optimierungsaufgaben’, Computing 19, 1, 5972.CrossRefGoogle Scholar
Werner, J. (1978), ‘Uber die globale konvergenz von Variable-Metric Verfahren mit nichtexakter Schrittweitenbestimmung’, Numer. Math. 31, 321334.CrossRefGoogle Scholar
Werner, J. (1989), ‘Global convergence of quasi-Newton methods with practical line searches’, NAM-Bericht 67, Institut für Numerische und Angewandte Mathematik der Universität Göttingen.Google Scholar
Wilkinson, J.H. (1965), The Algebraic Eigenvalue Problem, Oxford University Press (London).Google Scholar
Wolfe, P. (1969), ‘Convergence conditions for ascent methods’, SIAM Review 11, 226235.CrossRefGoogle Scholar
Wolfe, P. (1971), ‘Convergence conditions for ascent methods. II: Some corrections’, SIAM Review 13, 185188.CrossRefGoogle Scholar
Yuan, Y. (1991), ‘A modified BFGS algorithm for unconstrained optimization’, IMA J. Numer. Anal. 11(3), 325332.CrossRefGoogle Scholar
Yuan, Y. and Byrd, R. (1991), ‘Nonquasi-Newton updates for unconstrained optimization’, Technical Report, Department of Computer Science, University of Colorado (Boulder).Google Scholar
Zhang, Y. and Tewarson, R.P. (1988), ‘Quasi-Newton algorithms with updates from the pre-convex part of Broyden's family’, IMA J. Numer. Anal. 8, 487509.CrossRefGoogle Scholar
Zoutendijk, G. (1970), ‘Nonlinear Programming, Computational Methods’, in Integer and Nonlinear Programming (Abadie, J., ed.), North-Holland (Amsterdam), 3786.Google Scholar