Yu-Hong Dai
1. Remember, I only finished my PhD in 1963, mostly not doing optimization. But there were the Simplex method for LP, steepest descent, sequential penalty functions, and Levenberg-Gauss-Newton.

2. I received my first degree from Cambridge University in 1960, in Natural Sciences, majoring in Theoretical Physics. I find this background helps a lot when talking to users of optimization, and it has influenced my opinions on what aspects of mathematics are `useful' and what are less so. I received my PhD in 1963 from Leeds University, which had one of the few early computer laboratories to be established in a university, centred around a Ferranti Pegasus computer. Colin Reeves supervised my work on a project to develop methods for computing molecular wave functions. It involved many interesting topics such as numerical eigensolutions, symbolic automatic differentiation, tensor algebra and groups, and latterly optimization. The resulting code was one of the forerunners of the large-scale packages now used in quantum chemistry. Of course, the ideas were fed to me by Colin Reeves; my input was mainly in getting the programs to work! Certainly this helped my subsequent work in writing optimization codes.

3. Davidon issued his famous Argonne National Laboratory report in 1959. I don't know who received copies, but I was lucky that Colin Reeves somehow got one and passed it on to me. I coded it and realized it was able to solve problems substantially faster than steepest descent. Mike Powell also got a copy and also realized its potential. Davidon's report was presented in a very unusual way, and Mike was able to extract the essential feature that was involved. We pooled our experience and added some more theory, leading to the DFP paper.

4. Mike presented a seminar at Leeds, and changed his title at the last moment to describe his experiences with Davidon's method. He found that I was also working on the method, hence the subsequent cooperation. I didn't know Mike before that.

5. I have the greatest respect for both of them. Mike's influence on optimization and approximation theory has been immense, both in developing algorithms and in establishing the underlying theory. I like very much that he has concentrated firmly on the ultimate aim of finding what works best, and on making it available to users. Davidon brought the intuition of a theoretical physicist to bear, and his ideas were very innovative. I enjoyed the company of both of them, and remember many happy outings on the hills in Scotland.

6. Broyden saw that the method was one of a one-parameter family of methods. I found the method from the Sherman-Morrison formula via a symmetry (duality) argument. Goldfarb established it by a variational principle. I don't remember Shanno's reasoning, but it was certainly different again. I don't remember when the acronym BFGS first came into use.

7. I think the paper by Mike Powell on "How bad are the BFGS and DFP methods when the function is quadratic?" sheds the most light on this issue. But my (very limited) experience with formulae outside the convex class indicates that one can possibly improve on BFGS by small amounts, and still retain positive definite Hessians.
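For readers who have not seen it written out, the update that these four derivations all arrive at can be sketched in a few lines of Python, here in inverse-Hessian form with an exact line search on a quadratic. This is an illustrative fragment, not any of the historical codes; the two-variable test problem is invented for the example:

```python
import numpy as np

def bfgs_update(H, s, y):
    """One BFGS update of the inverse Hessian approximation H,
    given the step s = x_new - x_old and the gradient change
    y = g_new - g_old (requires the curvature condition s^T y > 0)."""
    rho = 1.0 / (y @ s)
    I = np.eye(len(s))
    V = I - rho * np.outer(s, y)
    return V @ H @ V.T + rho * np.outer(s, s)

# Minimize f(x) = 0.5 x^T A x - b^T x, an invented quadratic test problem.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x = np.zeros(2)
H = np.eye(2)                        # initial inverse Hessian guess
for _ in range(10):
    g = A @ x - b
    if np.linalg.norm(g) < 1e-10:
        break
    d = -H @ g                       # quasi-Newton search direction
    alpha = -(g @ d) / (d @ A @ d)   # exact line search for a quadratic
    s = alpha * d
    x_new = x + s
    y = (A @ x_new - b) - g
    H = bfgs_update(H, s, y)
    x = x_new

print(x)  # converges to the solution of A x = b
```

With exact line searches on a quadratic in n variables, the iteration terminates in at most n steps, which is part of what made the method's speed so striking compared with steepest descent.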
8. I know them all well and respect their work greatly. But I have not done any joint research with them.

9. There were various publications on CG for …

10. Colin Reeves was writing lecture notes on CG for …

11. He was a student of Boys at Cambridge who worked on molecular wave functions. He worked for a research group at ICI on theoretical chemistry for a short time, before coming to Leeds shortly before I did. He later moved into computer science at Keele University and I rather lost touch. He was a very innovative person and his move to CS was a great loss to numerical analysis.

12. It was clear early on that DFP and BFGS were superior to FR on small and medium-sized problems. Both the FR and DFP methods were a significant improvement on steepest descent, and so attracted favourable comments.

13. I have very little experience in using CG on large problems. Many large problems are solved more effectively using sparse matrix factors. Also, CG doesn't fit comfortably with inequality constrained problems. Preconditioning can help, but finding a good preconditioner is not straightforward. I can only envisage using (preconditioned) CG when nothing else is practicable. This means large problems, for which the Hessian is usually ill-conditioned. Getting even six figures of accuracy in the gradients is usually quite a challenge. But the relationship between the gradient error and the error in the variables indicates that the accuracy in the variables might be much less. Unfortunately there's no easy way to check this out, since if one could determine the solution accurately, one would use that technique in preference to CG. On the other hand, the occurrence of superlinear convergence in a Newton or quasi-Newton method is a good indication of an accurate solution.
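The accuracy point can be made concrete: near a minimum, g(x) ≈ H(x − x*), so a small gradient norm need not mean a small error in the variables when the Hessian is ill-conditioned; the error in x can exceed the error in g by a factor of order 1/λ_min(H). A minimal Python sketch, with an invented diagonal Hessian chosen purely for illustration:

```python
import numpy as np

# f(x) = 0.5 x^T H x with an ill-conditioned (invented) Hessian:
# eigenvalues 1 and 1e-6, so the condition number is 1e6.
H = np.diag([1.0, 1e-6])
x_star = np.zeros(2)           # the exact minimizer

# A point far from the solution along the flat direction.
x = x_star + np.array([0.0, 1.0])
g = H @ (x - x_star)           # gradient of the quadratic at x

print(np.linalg.norm(g))           # 1e-06: the gradient looks converged
print(np.linalg.norm(x - x_star))  # 1.0: the variables are not
```

Six figures of accuracy in the gradient here would still leave essentially no accuracy in the second variable, which is exactly the difficulty described above.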
14. I have over 120 publications in my CV. Most of them seemed interesting at the time. Fashions change, and what seems important today may not be so in ten years' time.

15. There is some evidence that the SR1 method converges faster than BFGS, especially when line searches are not used (say, in a trust region context). However, there is the problem of retaining a positive definite Hessian with SR1. My proposal in 2005 enables one to stay closer in a sense to SR1, whilst retaining definiteness. The low rank aspect enables the update to be used efficiently for large-scale NLP when the dimension of the null space is small. However, I find it very hard to envisage significant new ideas in nonlinear CG and quasi-Newton methods.
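For contrast with BFGS, the SR1 update being discussed can be sketched as follows. This is an illustrative Python fragment with invented test data; the skip rule is the usual textbook safeguard against a vanishing denominator, not the 2005 proposal mentioned above:

```python
import numpy as np

def sr1_update(B, s, y, eps=1e-8):
    """Symmetric rank-one update of a Hessian approximation B.
    Unlike BFGS, it need not preserve positive definiteness; the
    standard safeguard skips the update when the denominator
    (y - B s)^T s is too small relative to the factors."""
    r = y - B @ s
    denom = r @ s
    if abs(denom) < eps * np.linalg.norm(r) * np.linalg.norm(s):
        return B                     # skip: update would be unstable
    return B + np.outer(r, r) / denom

# One update on an invented quadratic: SR1 then satisfies the
# secant condition B_new s = y exactly along the step direction.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
s = np.array([1.0, 0.0])
y = A @ s                            # gradient change for a quadratic
B = np.eye(2)
B = sr1_update(B, s, y)
print(np.allclose(B @ s, y))         # True: secant condition holds
```

The rank-one correction is what makes the update cheap to apply, but nothing in it guarantees that B stays positive definite, which is the difficulty noted above.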
16. Nonlinear complementarity and MPECs have interested me recently, but currently I'm still working on methods for NLP. I'd like to get back to working more on applications at some time.

17. I worked at Harwell from 1969 to 1973, in company with Mike Powell and others. But Harwell was becoming commercialised, so I took the opportunity to move back into academia and join the very strong NA group at Dundee. I have been very happy here.

18. I don't keep records on this, but here is a list of students with whom I have published joint papers: Bill Bradbury (MSc), Tony McCann (MSc), Mike Hebden, Shirley Lill, Mike Jackson, Jim Sinclair, Rob Womersley, Paul Matthews, Mehi Al-Baali, Chenxian Xu, Eduardo Sainz de la Maza, Julian Hall, Sven Leyffer, Suliman Al-Homidan, Eric Chin. Al-Baali is Syrian, Xu is Chinese, Al-Homidan is Saudi and Chin is Malaysian.

19. I was brought up in the austere times of the Second World War and its aftermath. My father was killed in the war. My family was not well off, but they were very supportive of my making the most of my opportunities in life. I was lucky to attend a very good state (grammar) school which encouraged students to progress into university. I gained a state scholarship which enabled me to do this.

20. I used to play chess at Cambridge, and before that I played top board for the England junior side on one occasion. But I gave it up when I found I was being beaten by young players half my age! I took up bridge during my PhD and still play regularly. I have long been interested in music, and can play guitar, piano and clarinet, not very well, and not in public! Yes, I'm very keen on hill-walking, and Scotland and the North of England are the ideal places to do it. Familiarization with maps, contours, local maxima and saddle points certainly helps in visualizing optimization techniques. But it doesn't help very much in understanding the complexity of high dimensional space.

21. Famous Grouse?! Alternatively, treat everything you read about with some scepticism, and be prepared to follow your own intuition. But be willing to change your mind when it becomes clear that other ideas have been demonstrated to be superior. And in numerical analysis, watch for what the numbers are telling you.

Acknowledgements. The interviewer is much indebted to Roger Fletcher for agreeing to the interview, and to Jorge Nocedal for suggesting it. He also thanks Masao Fukushima and Yaxiang Yuan very much for their constructive and valuable suggestions during the preparation of the questions and the writing of this article. Gratitude is also due to the library of Kyoto University, which made it very convenient to look up many early journals.