Locally Weighted Learning

Abstract

This paper surveys locally weighted learning, a form of lazy learning and memory-based learning, and focuses on locally weighted linear regression. The survey discusses distance functions, smoothing parameters, weighting functions, local model structures, regularization of the estimates and bias, assessing predictions, handling noisy data and outliers, improving the quality of predictions by tuning fit parameters, interference between old and new data, implementing locally weighted learning efficiently, and applications of locally weighted learning. A companion paper surveys how locally weighted learning can be used in robot learning and control.
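
The central technique surveyed is locally weighted linear regression: at each query point a linear model is fit to the stored data, with each stored point weighted by a kernel of its distance to the query. As a rough illustration only (this code is not from the paper; the function name, the Gaussian kernel choice, the single bandwidth parameter, and the toy sine data are all assumptions made for this sketch), the following NumPy snippet shows one common form of the idea:

    import numpy as np

    def locally_weighted_regression(query, X, y, bandwidth=1.0):
        # Distance function: plain Euclidean distance from the query
        # point to every stored training point.
        dists = np.linalg.norm(X - query, axis=1)

        # Weighting function: a Gaussian kernel of the distance, so
        # nearby points dominate the fit and distant points are ignored.
        w = np.exp(-(dists ** 2) / (2.0 * bandwidth ** 2))

        # Local model structure: a linear model with an intercept term.
        Xa = np.hstack([X, np.ones((X.shape[0], 1))])
        qa = np.append(query, 1.0)

        # Weighted least squares: solve (Xa' W Xa) beta = Xa' W y,
        # where W is the diagonal matrix of kernel weights.
        WXa = Xa * w[:, None]
        beta, *_ = np.linalg.lstsq(WXa.T @ Xa, WXa.T @ y, rcond=None)

        # The prediction is the local linear model evaluated at the query.
        return qa @ beta

    # Toy usage: recover a noisy sine curve from 200 stored (x, y) pairs.
    rng = np.random.default_rng(0)
    X = rng.uniform(0.0, 2.0 * np.pi, size=(200, 1))
    y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
    print(locally_weighted_regression(np.array([np.pi / 2]), X, y, bandwidth=0.3))

In this sketch the single bandwidth plays the role of the smoothing parameter; the survey's sections on distance functions, weighting functions, and fit-parameter tuning discuss how such choices can be varied and tuned.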

References

  • AAAI-91 (1991). Ninth National Conference on Artificial Intelligence. AAAI Press/The MIT Press, Cambridge, MA.

  • Aha, D. W. (1989). Incremental, instance-based learning of independent and graded concept descriptions. In Sixth International Machine Learning Workshop, pp. 387–391. Morgan Kaufmann, San Mateo, CA.

  • Aha, D. W. (1990). A Study of Instance-Based Algorithms for Supervised Learning Tasks: Mathematical, Empirical, and Psychological Observations. PhD dissertation, University of California, Irvine, Department of Information and Computer Science.

  • Aha, D. W. (1991). Incremental constructive induction: An instance-based approach. In Eighth International Machine Learning Workshop, pp. 117–121. Morgan Kaufmann, San Mateo, CA.

  • Aha, D. W. & Goldstone, R. L. (1990). Learning attribute relevance in context in instance-based learning algorithms. In 12th Annual Conference of the Cognitive Science Society, pp. 141–148. Lawrence Erlbaum, Cambridge, MA.

  • Aha, D. W. & Goldstone, R. L. (1992). Concept learning and flexible weighting. In 14th Annual Conference of the Cognitive Science Society, pp. 534–539, Bloomington, IN. Lawrence Erlbaum Associates, Mahwah, NJ.

  • Aha, D. W. & Kibler, D. (1989). Noise-tolerant instance-based learning algorithms. In Eleventh International Joint Conference on Artificial Intelligence, pp 794–799. Morgan Kaufmann, San Mateo, CA.

  • Aha, D. W. & McNulty, D. M. (1989). Learning relative attribute weights for instance-based concept descriptions. In 11th Annual Conference of the Cognitive Science Society, pp. 530–537. Lawrence Erlbaum Associates, Mahwah, NJ.

  • Aha, D. W. & Salzberg, S. L. (1993). Learning to catch: Applying nearest neighbor algorithms to dynamic control tasks. In Proceedings of the Fourth International Workshop on Artificial Intelligence and Statistics, pp. 363–368, Ft. Lauderdale, FL.

  • Altman, N. S. (1992). An introduction to kernel and nearest-neighbor nonparametric regression. The American Statistician 46(3): 175–185.

  • Atkeson, C. G. (1990). Using local models to control movement. In Touretzky, D. S., editor, Advances In Neural Information Processing Systems 2, pp. 316–323. Morgan Kaufman, San Mateo, CA.

  • Atkeson, C. G. (1992). Memory-based approaches to approximating continuous functions. In Casdagli and Eubank (1992), pp. 503–521. Proceedings of a Workshop on Nonlinear Modeling and Forecasting September 17–21, 1990, Santa Fe, New Mexico.

  • Atkeson, C. G. (1996). Local learning. http://www.cc.gatech.edu/fac/Chris.Atkeson/local-learning/.

  • Atkeson, C. G., Moore, A. W. & Schaal, S. (1997). Locally weighted learning for control. Artificial Intelligence Review, this issue.

  • Atkeson, C. G. & Reinkensmeyer, D. J. (1988). Using associative content-addressable memories to control robots. In Proceedings of the 27th IEEE Conference on Decision and Control, volume 1, pp. 792–797, Austin, Texas. IEEE Cat. No.88CH2531–2.

  • Atkeson, C. G. & Reinkensmeyer, D. J. (1989). Using associative content-addressable memories to control robots. In Proceedings, IEEE International Conference on Robotics and Automation, Scottsdale, Arizona.

  • Atkeson, C. G. & Schaal, S. (1995). Memory-based neural networks for robot learning. Neurocomputing 9: 243–269.

  • Baird, L. C. & Klopf, A. H. (1993). Reinforcement learning with high-dimensional, continuous actions. Technical Report WL-TR-93-1147, Wright Laboratory, Wright-Patterson Air Force Base, Ohio. http://kirk.usafa.af.mil/~baird/papers/index.html.

  • Barnhill, R. E. (1977). Representation and approximation of surfaces. In Rice, J. R., editor, Mathematical Software III, pp. 69–120. Academic Press, New York, NY.

  • Batchelor, B. G. (1974). Practical Approach To Pattern Classification. Plenum Press, New York, NY.

  • Benedetti, J. K. (1977). On the nonparametric estimation of regression functions. Journal of the Royal Statistical Society, Series B 39: 248–253.

  • Bentley, J. L. (1975). Multidimensional binary search trees used for associative searching. Communications of the ACM 18(9): 509–517.

  • Bentley, J. L. & Friedman, J. H. (1979). Data structures for range searching. ACM Comput. Surv. 11(4): 397–409.

  • Bentley, J. L., Weide, B. & Yao, A. (1980). Optimal expected time algorithms for closest point problems. ACM Transactions on Mathematical Software 6: 563–580.

  • Blyth, S. (1993). Optimal kernel weights under a power criterion. Journal of the American Statistical Association 88(424): 1284–1286.

  • Bottou, L. & Vapnik, V. (1992). Local learning algorithms. Neural Computation 4(6): 888–900.

  • Bregler, C. & Omohundro, S. M. (1994). Surface learning with applications to lipreading. In Cowan et al. (1994), pp. 43–50.

  • Brockmann, M., Gasser, T. & Herrmann, E. (1993). Locally adaptive bandwidth choice for kernel regression estimators. Journal of the American Statistical Association, 88(424): 1302–1309.

  • Broder, A. J. (1990). Strategies for efficient incremental nearest neighbor search. Pattern Recognition 23: 171–178.

  • Callan, J. P., Fawcett, T. E. & Rissland, E. L. (1991). CABOT: An adaptive approach to case based search. In IJCAI 12 (1991), pp. 803–808.

  • Casdagli, M. & Eubank, S. (eds.) (1992). Nonlinear Modeling and Forecasting. Proceedings Volume XII in the Santa Fe Institute Studies in the Sciences of Complexity. Addison Wesley, New York, NY. Proceedings of a Workshop on Nonlinear Modeling and Forecasting September 17–21, 1990, Santa Fe, New Mexico.

  • Cheng, P. E. (1984). Strong consistency of nearest neighbor regression function estimators. Journal of Multivariate Analysis 15: 63–72.

  • Cleveland, W. S. (1979). Robust locally weighted regression and smoothing scatterplots. Journal of the American Statistical Association 74: 829–836.

  • Cleveland, W. S. (1993a). Coplots, nonparametric regression, and conditionally parametric fits. Technical Report 19, AT&T Bell Laboratories, Statistics Department, Murray Hill, NJ. http://netlib.att.com/netlib/att/stat/doc/.

  • Cleveland, W. S. (1993b). Visualizing Data. Hobart Press, Summit, NJ. books@hobart.com.

  • Cleveland, W. S. & Devlin, S. J. (1988). Locally weighted regression: An approach to regression analysis by local fitting. Journal of the American Statistical Association 83: 596–610.

  • Cleveland, W. S., Devlin, S. J. & Grosse, E. (1988). Regression by local fitting: Methods, properties, and computational algorithms. Journal of Econometrics 37: 87–114.

  • Cleveland, W. S. & Grosse, E. (1991). Computational methods for local regression. Statistics and Computing 1(1): 47–62. ftp://cm.bell-labs.com/cm/cs/doc/91/4–04.ps.gz.

  • Cleveland, W. S., Grosse, E. & Shyu, W. M. (1992). Local regression models. In Chambers, J. M. & Hastie, T. J. (eds.), Statistical Models in S, pp. 309–376. Wadsworth, Pacific Grove, CA. http://netlib.att.com/netlib/a/cloess.ps.Z.

  • Cleveland, W. S. & Loader, C. (1994a). Computational methods for local regression. Technical Report 11, AT&T Bell Laboratories, Statistics Department, Murray Hill, NJ. http://netlib.att.com/netlib/att/stat/doc/.

  • Cleveland, W. S. & Loader, C. (1994b). Local fitting for semiparametric (nonparametric) regression: Comments on a paper of Fan and Marron. Technical Report 8, AT&T Bell Laboratories, Statistics Department, Murray Hill, NJ. http://netlib.att.com/netlib/att/stat/doc/, 94.8.ps, earlier version is 94.3.ps.

  • Cleveland, W. S. & Loader, C. (1994c). Smoothing by local regression: Principles and methods. Technical Report 95.3, AT&T Bell Laboratories, Statistics Department, Murray Hill, NJ. http://netlib.att.com/netlib/att/stat/doc/.

  • Cleveland, W. S., Mallows, C. L. & McRae, J. E. (1993). ATS methods: Nonparametric regression for non-Gaussian data. Journal of the American Statistical Association 88(423): 821–835.

  • Connell, M. E. & Utgoff, P. E. (1987). Learning to control a dynamic physical system. In Sixth National Conference on Artificial Intelligence, pp. 456–460, Seattle, WA. Morgan Kaufmann, San Mateo, CA.

  • Cost, S. & Salzberg, S. (1993). A weighted nearest neighbor algorithm for learning with symbolic features. Machine Learning 10(1): 57–78.

  • Coughran, Jr., W. M. & Grosse, E. (1991). Seeing and hearing dynamic loess surfaces. In Interface '91 Proceedings, pp. 224–228. Springer-Verlag. ftp://cm.bell-labs.com/cm/cs/doc/91/4–07.ps.gz or 4–07long.ps.gz.

  • Cowan, J. D., Tesauro, G. & Alspector, J. (eds.) (1994). Advances In Neural Information Processing Systems 6. Morgan Kaufman, San Mateo, CA.

  • Crain, I. K. & Bhattacharyya, B. K. (1967). Treatment of nonequispaced two dimensional data with a digital computer. Geoexploration 5: 173–194.

  • Deheuvels, P. (1977). Estimation non-paramétrique de la densité par histogrammes généralisés. Revue de Statistique Appliquée 25: 5–42.

  • Deng, K. & Moore, A. W. (1995). Multiresolution instance-based learning. In Fourteenth International Joint Conference on Artificial Intelligence, pp. 1233–1239. Morgan Kaufmann, San Mateo, CA.

  • Dennis, J. E., Gay, D. M. & Welsch, R. E. (1981). An adaptive nonlinear least-squares algorithm. ACM Transactions on Mathematical Software 7(3): 369–383.

  • Devroye, L. (1981). On the almost everywhere convergence of nonparametric regression function estimates. The Annals of Statistics 9(6): 1310–1319.

  • Diebold, F. X. & Nason, J. A. (1990). Nonparametric exchange rate prediction? Journal of International Economics 28: 315–332.

  • Dietterich, T. G., Wettschereck, D., Atkeson, C. G. & Moore, A. W. (1994). Memory-based methods for regression and classification. In Cowan et al. (1994), pp. 1165–1166.

  • Draper, N. R. & Smith, H. (1981). Applied Regression Analysis. John Wiley, New York, NY, 2nd edition.

  • Elliot, T. & Scott, P. D. (1991). Instance-based and generalization-based learning procedures applied to solving integration problems. In Proceedings of the Eighth Conference of the Society for the Study of Artificial Intelligence, pp. 256–265, Leeds, England. Springer Verlag.

  • Epanechnikov, V. A. (1969). Nonparametric estimation of a multivariate probability density. Theory of Probability and Its Applications 14: 153–158.

  • Eubank, R. L. (1988). Spline Smoothing and Nonparametric Regression. Marcel Dekker, New York, NY.

  • Falconer, K. J. (1971). A general purpose algorithm for contouring over scattered data points. Technical Report NAC 6, National Physical Laboratory, Teddington, Middlesex, United Kingdom, TW11 0LW.

  • Fan, J. (1992). Design-adaptive nonparametric regression. Journal of the American Statistical Association 87(420): 998–1004.

  • Fan, J. (1993). Local linear regression smoothers and their minimax efficiencies. Annals of Statistics 21: 196–216.

  • Fan, J. (1995). Local modeling. EES Update: written for the Encyclopedia of Statistics Science, http://www.stat.unc.edu/faculty/fan/papers.html.

  • Fan, J. & Gijbels, I. (1992). Variable bandwidth and local linear regression smoothers. The Annals of Statistics 20(4): 2008–2036.

  • Fan, J. & Gijbels, I. (1994). Censored regression: Local linear approximations and their applications. Journal of the American statistical Association 89: 560–570.

  • Fan, J. & Gijbels, I. (1995a). Adaptive order polynomial fitting: Bandwidth robustification and bias reduction. J. Comp. Graph. Statist. 4: 213–227.

  • Fan, J. & Gijbels, I. (1995b). Data-driven bandwidth selection in local polynomial fitting: Variable bandwidth and spatial adaptation. Journal of the Royal Statistical Society B 57: 371–394.

  • Fan, J. & Gijbels, I. (1996). Local Polynomial Modeling and its Applications. Chapman and Hall, London.

  • Fan, J. & Hall, P. (1994). On curve estimation by minimizing mean absolute deviation and its implications. The Annals of Statistics 22(2): 867–885.

  • Fan, J. & Kreutzberger, E. (1995). Automatic local smoothing for spectral density estimation. ftp://stat.unc.edu/pub/fan/spec.ps.

  • Fan, J. & Marron, J. S. (1993). Comment on [Hastie and Loader, 1993]. Statistical Science 8(2): 129–134.

  • Fan, J. & Marron, J. S. (1994a). Fast implementations of nonparametric curve estimators. Journal of Computational and Statistical Graphics 3: 35–56.

  • Fan, J. & Marron, J. S. (1994b). Rejoinder to discussion of Cleveland and Loader.

  • Farmer, J. D. & Sidorowich, J. J. (1987). Predicting chaotic time series. Physical Review Letters 59(8): 845–848.

  • Farmer, J. D. & Sidorowich, J. J. (1988a). Exploiting chaos to predict the future and reduce noise. In Lee, Y. C. (ed.), Evolution, Learning, and Cognition, pp. 277 ff. World Scientific Press, NJ. Also available as Technical Report LA-UR-88-901, Los Alamos National Laboratory, Los Alamos, New Mexico.

  • Farmer, J. D. & Sidorowich, J. J. (1988b). Predicting chaotic dynamics. In Kelso, J. A. S., Mandell, A. J. & Schlesinger, M. F. (eds.), Dynamic Patterns in Complex Systems, pp. 265–292. World Scientific, NJ.

  • Farwig, R. (1987). Multivariate interpolation of scattered data by moving least squares methods. In Mason, J. C. & Cox, M. G. (eds.), Algorithms for Approximation, pp. 193–211. Clarendon Press, Oxford.

  • Fedorov, V. V., Hackl, P. & Müller, W. G. (1993). Moving local regression: The weight function. Nonparametric Statistics 2(4): 355–368.

  • Franke, R. & Nielson, G. (1980). Smooth interpolation of large sets of scattered data. International Journal for Numerical Methods in Engineering 15: 1691–1704.

  • Friedman, J. H. (1984). A variable span smoother. Technical Report LCS 5, Stanford University, Statistics Department, Stanford, CA.

  • Friedman, J. H. (1994). Flexible metric nearest neighbor classification. http://playfair.stanford.edu/reports/friedman/.

  • Friedman, J. H., Bentley, J. L. & Finkel, R. A. (1977). An algorithm for finding best matches in logarithmic expected time. ACM Transactions on Mathematical Software 3(3): 209–226.

  • Fritzke, B. (1995). Incremental learning of local linear mappings. In Proceedings of the International Conference on Artificial Neural Networks ICANN '95, pp. 217–222, Paris, France.

  • Fukunaga, K. (1990). Introduction to Statistical Pattern Recognition. Academic Press, New York, NY, second edition.

  • Gasser, T. & Müller, H. G. (1979). Kernel estimation of regression functions. In Gasser, T. & Rosenblatt, M. (eds.), Smoothing Techniques for Curve Estimation, number 757 in Lecture Notes in Mathematics, pp. 23–67. Springer-Verlag, Heidelberg.

  • Gasser, T. & Müller, H. G. (1984). Estimating regression functions and their derivatives by the kernel method. Scandinavian Journal of Statistics 11: 171–185.

  • Gasser, T., Müller, H. G. & Mammitzsch, V. (1985). Kernels for nonparametric regression. Journal of the Royal Statistical Society, Series B 47: 238–252.

  • Ge, Z., Cavinato, A. G. & Callis, J. B. (1994). Noninvasive spectroscopy for monitoring cell density in a fermentation process. Analytical Chemistry 66: 1354–1362.

  • Goldberg, K. Y. & Pearlmutter, B. (1988). Using a neural network to learn the dynamics of the CMU Direct-Drive Arm II. Technical Report CMU-CS–88–160, Carnegie-Mellon University, Pittsburgh, PA.

  • Gorinevsky, D. & Connolly, T. H. (1994). Comparison of some neural network and scattered data approximations: The inverse manipulator kinematics example. Neural Computation 6: 521–542.

  • Goshtasby, A. (1988). Image registration by local approximation methods. Image and Vision Computing 6(4): 255–261.

  • Grosse, E. (1989). LOESS: Multivariate smoothing by moving least squares. In Chui, C. K., Schumaker, L. L. & Ward, J. D. (eds.), Approximation Theory VI, pp. 1–4. Academic Press, Boston, MA.

  • Hammond, S. V. (1991). NIR analysis of antibiotic fermentations. In Murray, I. & Cowe, I. A. (eds.), Making Light Work: Advances in Near Infrared Spectroscopy, pp. 584–589. VCH, New York, NY. Developed from the 4th International Conference on Near Infrared Spectroscopy, Aberdeen, Scotland, August 19–23, 1991.

  • Hampel, F. R., Ronchetti, E. M., Rousseeuw, P. J. & Stahel, W. A. (1986). Robust Statistics: The Approach Based On Influence Functions. John Wiley, New York, NY.

  • Härdle, W. (1990). Applied Nonparametric Regression. Cambridge University Press, New York, NY.

  • Hastie, T. & Loader, C. (1993). Local regression: Automatic kernel carpentry. Statistical Science 8(2): 120–143.

  • Hastie, T. J. & Tibshirani, R. J. (1990). Generalized Additive Models. Chapman and Hall, London.

  • Hastie, T. J. & Tibshirani, R. J. (1994). Discriminant adaptive nearest neighbor classification. ftp://playfair.Stanford.EDU/pub/hastie/dann.ps.Z.

  • Higuchi, T., Kitano, H., Furuya, T., Handa, K., Takahashi, N. & Kokubu, A. (1991). IXM2: A parallel associative processor for knowledge processing. In AAAI-91 (1991), pp. 296–303.

  • Hillis, D. (1985). The Connection Machine. MIT Press, Cambridge, MA.

  • Huang, P. S. (1996). Planning For Dynamic Motions Using A Search Tree. MS thesis, University of Toronto, Graduate Department of Computer Science. http://www.dgp.utoronto.ca/people/psh/home.html.

  • IJCAI 12 (1991). Twelfth International Joint Conference on Artificial Intelligence. Morgan Kaufmann, San Mateo, CA.

  • IJCAI 13 (1993). Thirteenth International Joint Conference on Artificial Intelligence. Morgan Kaufmann, San Mateo, CA.

  • Jabbour, K., Riveros, J. F. W., Landsbergen, D. & Meyer, W. (1987). ALFA: Automated load forecasting assistant. In Proceedings of the 1987 IEEE Power Engineering Society Summer Meeting, San Francisco, CA.

  • James, M. (1985). Classification Algorithms. John Wiley and Sons, New York, NY.

  • Jones, M. C., Davies, S. J. & Park, B. U. (1994). Versions of kernel-type regression estimators. Journal of the American Statistical Association 89(427): 825–832.

  • Karalič, A. (1992). Employing linear regression in regression tree leaves. In Neumann, B. (ed.), ECAI 92: 10th European Conference on Artificial Intelligence, pp. 440–441, Vienna, Austria. John Wiley and Sons.

  • Katkovnik, V. Y. (1979). Linear and nonlinear methods of nonparametric regression analysis. Soviet Automatic Control 5: 25–34.

  • Kazmierczak, H. & Steinbuch, K. (1963). Adaptive systems in pattern recognition. IEEE Transactions on Electronic Computers EC-12: 822–835.

  • Kibler, D., Aha, D. W. & Albert, M. (1989). Instance-based prediction of real-valued attributes. Computational Intelligence 5: 51–57.

  • Kitano, H. (1993a). Challenges of massive parallelism. In IJCAI 13 (1993), pp. 813–834.

  • Kitano, H. (1993b). A comprehensive and practical model of memory-based machine translation. In IJCAI 13 (1993), pp. 1276–1282.

  • Kitano, H. & Higuchi, T. (1991a). High performance memory-based translation on IXM2 massively parallel associative memory processor. In AAAI-91 (1991), pp. 149–154.

  • Kitano, H. & Higuchi, T. (1991b). Massively parallel memory-based parsing. In IJCAI 12 (1991), pp. 918–924.

  • Kitano, H., Moldovan, D. & Cha, S. (1991). High performance natural language processing on semantic network array processor. In IJCAI 12 (1991), pp. 911–917.

  • Kozek, A. S. (1992). A new nonparametric estimation method: Local and nonlinear. Interface 24: 389–393.

  • Lancaster, P. (1979). Moving weighted least-squares methods. In Sahney, B. N. (ed.), Polynomial and Spline Approximation, pp. 103–120. D. Reidel Publishing, Boston, MA.

  • Lancaster, P. & Šalkauskas, K. (1981). Surfaces generated by moving least squares methods. Mathematics of Computation 37(155): 141–158.

  • Lancaster, P. & Šalkauskas, K. (1986). Curve And Surface Fitting. Academic Press, New York, NY.

  • Lawrence, S., Tsoi, A. C. & Black, A. D. (1996). Function approximation with neural networks and local methods: Bias, variance and smoothness. In Australian Conference on Neural Networks, Canberra, Australia. Available from http://www.neci.nj.nec.com/homepages/lawrence and http://www.elec.uq.edu.au/~lawrence.

  • LeBaron, B. (1990). Forecast improvements using a volatility index. Unpublished.

  • LeBaron, B. (1992). Nonlinear forecasts for the S&P stock index. In Casdagli and Eubank (1992), pp. 381–393. Proceedings of a Workshop on Nonlinear Modeling and Forecasting September 17–21, 1990, Santa Fe, New Mexico.

  • Legg, M. P. C. & Brent, R. P. (1969). Automatic contouring. In 4th Australian Computer Conference, pp. 467–468.

  • Lejeune, M. (1984). Optimization in non-parametric regression. In COMPSTAT 1984: Proceedings in Computational Statistics, pp. 421–426, Prague. Physica-Verlag Wien.

  • Lejeune, M. (1985). Estimation non-paramétrique par noyaux: Régression polynômial mobile. Revue de Statistique Appliquée 23(3): 43–67.

  • Lejeune, M. & Sarda, P. (1992). Smooth estimators of distribution and density functions. Computational Statistics & Data Analysis 14: 457–471.

  • Li, K. C. (1984). Consistency for cross-validated nearest neighbor estimates in nonparametric regression. The Annals of Statistics 12: 230–240.

  • Loader, C. (1994). Computing nonparametric function estimates. Technical Report 7, AT&T Bell Laboratories, Statistics Department, Murray Hill, NJ. Available by anonymous FTP from netlib.att.com in /netlib/att/stat/doc/94/7.ps.

  • Lodwick, G. D. & Whittle, J. (1970). A technique for automatic contouring field survey data. Australian Computer Journal 2: 104–109.

  • Lowe, D. G. (1995). Similarity metric learning for a variable-kernel classifier. Neural Computation 7: 72–85.

  • Maron, O. & Moore, A. W. (1997). The racing algorithm: Model selection for lazy learners. Artificial Intelligence Review, this issue.

  • Marron, J. S. (1988). Automatic smoothing parameter selection: A survey. Empirical Economics 13: 187–208.

  • McCallum, R. A. (1995). Instance-based utile distinctions for reinforcement learning with hidden state. In Prieditis & Russell (eds.) (1995), pp. 387–395.

  • McIntyre, D. B., Pollard, D. D. & Smith, R. (1968). Computer programs for automatic contouring. Technical Report Kansas Geological Survey Computer Contributions 23, University of Kansas, Lawrence, KS.

  • McLain, D. H. (1974). Drawing contours from arbitrary data points. The Computer Journal 17(4): 318–324.

  • Medin, D. L. & Shoben, E. J. (1988). Context and structure in conceptual combination. Cognitive Psychology 20: 158–190.

  • Meese, R. & Wallace, N. (1991). Nonparametric estimation of dynamic hedonic price models and the construction of residential housing price indices. American Real Estate and Urban Economics Association Journal 19(3): 308–332.

  • Meese, R. A. & Rose, A. K. (1990). Nonlinear, nonparametric, nonessential exchange rate estimation. The American Economic Review May: 192–196.

  • Miller, A. J. (1990). Subset Selection in Regression. Chapman and Hall, London.

  • Miller, W. T., Glanz, F. H. & Kraft, L. G. (1987). Application of a general learning algorithm to the control of robotic manipulators. International Journal of Robotics Research 6: 84–98.

  • Mohri, T. & Tanaka, H. (1994). An optimal weighting criterion of case indexing for both numeric and symbolic attributes. In Aha, D. W. (ed.), AAAI-94 Workshop Program: Case-Based Reasoning, Working Notes, pp. 123–127. AAAI Press, Seattle, WA.

  • Moore, A. W. (1990a). Acquisition of Dynamic Control Knowledge for a Robotic Manipulator. In Seventh International Machine Learning Workshop. Morgan Kaufmann, San Mateo, CA.

  • Moore, A. W. (1990b). Efficient Memory-based Learning for Robot Control. PhD. Thesis; Technical Report No. 209, Computer Laboratory, University of Cambridge.

  • Moore, A. W., Hill, D. J. & Johnson, M. P. (1992). An empirical investigation of brute force to choose features, smoothers, and function approximators. In Hanson, S., Judd, S. & Petsche, T. (eds.), Computational Learning Theory and Natural Learning Systems, volume 3. MIT Press, Cambridge, MA.

  • Moore, A. W. & Schneider, J. (1995). Memory-based stochastic optimization. To appear in the proceedings of NIPS-95. Also available as Technical Report CMU-RI-TR-95-30, ftp://ftp.cs.cmu.edu/afs/cs.cmu.edu/project/reinforcement/papers/memstoch.ps.

  • More, J. J., Garbow, B. S. & Hillstrom, K. E. (1980). User guide for MINPACK-1. Technical Report ANL–80–74, Argonne National Laboratory, Argonne, Illinois.

  • Müller, H.-G. (1987). Weighted local regression and kernel methods for nonparametric curve fitting. Journal of the American Statistical Association 82: 231–238.

  • Müller, H.-G. (1993). Comment on [Hastie and Loader, 1993]. Statistical Science 8(2): 134–139.

  • Murphy, O. J. & Selkow, S. M. (1986). The efficiency of using k-d trees for finding nearest neighbors in discrete space. Information Processing Letters 23: 215–218.

  • Myers, R. H. (1990). Classical and Modern Regression With Applications. PWS-KENT, Boston, MA.

  • Nadaraya, E. A. (1964). On estimating regression. Theory of Probability and Its Applications 9: 141–142.

  • Næs, T. & Isaksson, T. (1992). Locally weighted regression in diffuse near-infrared transmittance spectroscopy. Applied Spectroscopy 46(1): 34–43.

  • Næs, T., Isaksson, T. & Kowalski, B. R. (1990). Locally weighted regression and scatter correction for near-infrared reflectance data. Analytical Chemistry 62(7): 664–673.

  • Nguyen, T., Czerwinski, M. & Lee, D. (1993). COMPAQ Quicksource: Providing the consumer with the power of artificial intelligence. In Proceedings of the Fifth Annual Conference on Innovative Applications of Artificial Intelligence, pp. 142–150, Washington, DC. AAAI Press.

  • Nosofsky, R. M., Clark, S. E. & Shin, H. J. (1989). Rules and exemplars in categorization, identification, and recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition 15: 282–304.

  • Omohundro, S. M. (1987). Efficient Algorithms with Neural Network Behaviour. Journal of Complex Systems 1(2): 273–347.

  • Omohundro, S. M. (1991). Bumptrees for Efficient Function, Constraint, and Classification Learning. In Lippmann, R. P., Moody, J. E. & Touretzky, D. S. (eds.), Advances in Neural Information Processing Systems 3. Morgan Kaufmann.

  • Palmer, J. A. B. (1969). Automatic mapping. In 4th Australian Computer Conference, pp. 463–466.

  • Pelto, C. R., Elkins, T. A. & Boyd, H. A. (1968). Automatic contouring of irregularly spaced data. Geophysics 33: 424–430.

  • Peng, J. (1995). Efficient memory-based dynamic programming. In Prieditis & Russell (eds.) (1995), pp. 438–446.

  • Press, W. H., Teukolsky, S. A., Vetterling, W. T. & Flannery, B. P. (1988). Numerical Recipes in C. Cambridge University Press, New York, NY.

  • Prieditis, A. & Russell, S. (eds.) (1995). Twelfth International Conference on Machine Learning. Morgan Kaufmann, San Mateo, CA.

  • Rachlin, J., Kasif, S., Salzberg, S. & Aha, D. W. (1994). Towards a better understanding of memory-based reasoning systems. In Eleventh International Conference on Machine Learning, pp. 242–250. Morgan Kaufmann, San Mateo, CA.

  • Racine, J. (1993). An efficient cross-validation algorithm for window width selection for non-parametric kernel regression. Communications in Statistics: Simulation and Computation 22(4): 1107–1114.

  • Ramasubramanian, V. & Paliwal, K. K. (1989). A generalized optimization of the k-d tree for fast nearest-neighbour search. In International Conference on Acoustics, Speech, and Signal Processing.

  • Raz, J., Turetsky, B. I. & Fein, G. (1989). Selecting the smoothing parameter for estimation of smoothly changing evoked potential signals. Biometrics 45: 851–871.

  • Renka, R. J. (1988). Multivariate interpolation of large sets of scattered data. ACM Transactions on Mathematical Software 14(2): 139–152.

  • Ruppert, D. & Wand, M. P. (1994). Multivariate locally weighted least squares regression. The Annals of Statistics 22(3): 1346–1370.

  • Ruprecht, D. & Müller, H. (1992). Image warping with scattered data interpolation methods. Technical Report 443, Universität Dortmund, Fachbereich Informatik, D-44221 Dortmund, Germany. Available for anonymous FTP from ftp-1s7.informatik.uni-dortmund.de in pub/reports/ls7/rr-443.ps.Z.

  • Ruprecht, D. & Müller, H. (1993). Free form deformation with scattered data interpolation methods. In Farin, G., Hagen, H. & Noltemeier, H. (eds.), Geometric Modelling (Computing Suppl. 8), pp. 267–281. Springer Verlag. Available for anonymous FTP from ftp-ls7.informatik.uni-dortmund.de in pub/reports/iif/rr-41.ps.Z.

  • Ruprecht, D. & Müller, H. (1994a). Deformed cross-dissolves for image interpolation in scientific visualization. The Journal of Visualization and Computer Animation 5(3): 167–181. Available for anonymous FTP from ftp-ls7.informatik.uni-dortmund.de in pub/reports/ls7/rr-491.ps.Z.

  • Ruprecht, D. & Müller, H. (1994b). A framework for generalized scattered data interpolation. Technical Report 517, Universität Dortmund, Fachbereich Informatik, D-44221 Dortmund, Germany. Available for anonymous FTP from ftp-ls7.informatik.uni-dortmund.de in pub/reports/ls7/rr-517.ps.Z.

  • Ruprecht, D., Nagel, R. & Müller, H. (1994). Spatial free form deformation with scattered data interpolation methods. Technical Report 539, Fachbereich Informatik der Universität Dortmund, 44221 Dortmund, Germany. Accepted for publication by Computers & Graphics, Available for anonymous FTP from ftp-ls7.informatik.uni-dortmund.de in pub/reports/ls7/rr-539.ps.Z.

  • Rust, R. T. & Bornman, E. O. (1982). Distribution-free methods of approximating nonlinear marketing relationships. Journal of Marketing Research XIX: 372–374.

  • Sabin, M. A. (1980). Contouring — a review of methods for scattered data. In Brodlie, K. (ed.), Mathematical Methods in Computer Graphics and Design, pp. 63–86. Academic Press, New York, NY.

  • Saitta, L. (ed.) (1996). Thirteenth International Conference on Machine Learning. Morgan Kaufmann, San Mateo, CA.

  • Samet, H. (1990). The Design and Analysis of Spatial Data Structures. Addison-Wesley, Reading, MA.

  • Schaal, S. & Atkeson, C. G. (1994). Assessing the quality of learned local models. In Cowan et al. (1994), pp. 160–167.

  • Schaal, S. & Atkeson, C. G. (1995). From isolation to cooperation: An alternative view of a system of experts. NIPS95 proceedings, in press.

  • Scott, D. W. (1992). Multivariate Density Estimation. Wiley, New York, NY.

  • Seber, G. A. F. (1977). Linear Regression Analysis. John Wiley, New York, NY.

  • Seifert, B., Brockmann, M., Engel, J. & Gasser, T. (1994). Fast algorithms for nonparametric curve estimation. Journal of Computational and Graphical Statistics 3(2): 192–213.

  • Seifert, B. & Gasser, T. (1994). Variance properties of local polynomials. http://www.unizh.ch/biostat/manuscripts.html.

  • Shepard, D. (1968). A two-dimensional function for irregularly spaced data. In 23rd ACM National Conference, pp. 517–524.

  • Solow, A. R. (1988). Detecting changes through time in the variance of a long-term hemispheric temperature record: An application of robust locally weighted regression. Journal of Climate 1: 290–296.

  • Specht, D. E. (1991). A general regression neural network. IEEE Transactions on Neural Networks 2(6): 568–576.

  • Sproull, R. F. (1991). Refinements to nearest-neighbor searching in k-d trees. Algorithmica 6: 579–589.

  • Stanfill, C. (1987). Memory-based reasoning applied to English pronunciation. In Sixth National Conference on Artificial Intelligence, pp. 577–581.

  • Stanfill, C. & Waltz, D. (1986). Toward memory-based reasoning. Communications of the ACM 29(12): 1213–1228.

  • Steinbuch, K. (1961). Die lernmatrix. Kybernetik 1: 36–45.

  • Steinbuch, K. & Piske, U. A. W. (1963). Learning matrices and their applications. IEEE Transactions on Electronic Computers EC-12: 846–862.

  • Stone, C. J. (1975). Nearest neighbor estimators of a nonlinear regression function. In Computer Science and Statistics: 8th Annual Symposium on the Interface, pp. 413–418.

  • Stone, C. J. (1977). Consistent nonparametric regression. The Annals of Statistics 5: 595–645.

  • Stone, C. J. (1980). Optimal rates of convergence for nonparametric estimators. The Annals of Statistics 8: 1348–1360.

  • Stone, C. J. (1982). Optimal global rates of convergence for nonparametric regression. The Annals of Statistics 10(4): 1040–1053.

  • Sumita, E., Oi, K., Furuse, O., Iida, H., Higuchi, T., Takahashi, N. & Kitano, H. (1993). Example-based machine translation on massively parallel processors. In IJCAI 13 (1993), pp. 1283–1288.

  • Tadepalli, P. & Ok, D. (1996). Scaling up average reward reinforcement learning by approximating the domain models and the value function. In Saitta (1996). http://www.cs.orst.edu:80/~tadepall/research/publications.html.

  • Tamada, T., Maruyama, M., Nakamura, Y., Abe, S. & Maeda, K. (1993). Water demand forecasting by memory based learning. Water Science and Technology 28(11–12): 133–140.

  • Taylor, W. K. (1959). Pattern recognition by means of automatic analogue apparatus. Proceedings of The Institution of Electrical Engineers 106B: 198–209.

  • Taylor, W. K. (1960). A parallel analogue reading machine. Control 3: 95–99.

  • Thorpe, S. (1995). Localized versus distributed representations. In Arbib, M. A. (ed.), The Handbook of Brain Theory and Neural Networks, pp. 549–552. The MIT Press, Cambridge, MA.

  • Thrun, S. (1996). Is learning the n-th thing any easier than learning the first? In Advances in Neural Information Processing Systems (NIPS) 8. http://www.cs.cmu.edu/afs/cs.cmu.edu/Web/People/thrun/publications.html.

  • Thrun, S. & O'Sullivan, J. (1996). Discovering structure in multiple learning tasks: The TC algorithm. In Saitta (1996). http://www.cs.cmu.edu/afs/cs.cmu.edu/Web/People/thrun/publications.html.

  • Tibshirani, R. & Hastie, T. (1987). Local likelihood estimation. Journal of the American Statistical Association 82: 559–567.

  • Ting, K. M. & Cameron-Jones, R. M. (1994). Exploring a framework for instance based learning and naive Bayesian classifiers. In Proceedings of the Seventh Australian Joint Conference on Artificial Intelligence, Armidale, Australia. World Scientific.

  • Tou, J. T. & Gonzalez, R. C. (1974). Pattern Recognition Principles. Addison-Wesley, Reading, MA.

  • Townshend, B. (1992). Nonlinear prediction of speech signals. In Casdagli and Eubank (1992), pp. 433–453. Proceedings of a Workshop on Nonlinear Modeling and Forecasting September 17–21, 1990, Santa Fe, New Mexico.

  • Tsybakov, A. B. (1986). Robust reconstruction of functions by the local approximation method. Problems of Information Transmission 22: 133–146.

  • Tukey, J. (1977). Exploratory Data Analysis. Addison-Wesley, Reading, MA.

  • Turetsky, B. I., Raz, J. & Fein, G. (1989). Estimation of trial-to-trial variation in evoked potential signals by smoothing across trials. Psychophysiology 26(6): 700–712.

  • Turlach, B. A. & Wand, M. P. (1995). Fast computation of auxiliary quantities in local polynomial regression. http://netec.wustl.edu/~adnetec/WoPEc/agsmst/agsmst95009.html.

  • van der Smagt, P., Groen, F. & van het Groenewoud, F. (1994). The locally linear nested network for robot manipulation. In Proceedings of the IEEE International Conference on Neural Networks, pp. 2787–2792. ftp://ftp.fwi.uva.nl/pub/computer-systems/aut-sys/reports/SmaGroGro94b.ps.gz.

  • Vapnik, V. (1992). Principles of risk minimization for learning theory. In Moody, J. E., Hanson, S. J. & Lippmann, R. P. (eds.), Advances In Neural Information Processing Systems 4, pp. 831–838. Morgan Kaufman, San Mateo, CA.

  • Vapnik, V. & Bottou, L. (1993). Local algorithms for pattern recognition and dependencies estimation. Neural Computation 5(6): 893–909.

  • Walden, A. T. & Prescott, P. (1983). Identification of trends in annual maximum sea levels using robust locally weighted regression. Estuarine, Coastal and Shelf Science 16: 17–26.

  • Walters, R. F. (1969). Contouring by machine: A user's guide. American Association of Petroleum Geologists Bulletin 53(11): 2324–2340.

  • Waltz, D. L. (1987). Applications of the Connection Machine. Computer 20(1): 85–97.

  • Wand, M. P. & Jones, M. C. (1993). Comparison of smoothing parameterizations in bivariate kernel density estimation. Journal of the American Statistical Association 88: 520–528.

  • Wand, M. P. & Jones, M. C. (1994). Kernel Smoothing. Chapman and Hall, London.

  • Wand, M. P. & Schucany, W. R. (1990). Gaussian-based kernels for curve estimation and window width selection. Canadian Journal of Statistics 18: 197–204.

  • Wang, Z., Isaksson, T. & Kowalski, B. R. (1994). New approach for distance measurement in locally weighted regression. Analytical Chemistry 66(2): 249–260.

  • Watson, G. S. (1964). Smooth regression analysis. Sankhyā: The Indian Journal of Statistics, Series A, 26: 359–372.

  • Weisberg, S. (1985). Applied Linear Regression. John Wiley and Sons.

  • Wess, S., Althoff, K.-D. & Derwand, G. (1994). Using k-d trees to improve the retrieval step in case-based reasoning. In Wess, S., Althoff, K.-D. & Richter, M. M. (eds.), Topics in Case-Based Reasoning, pp. 167–181. Springer-Verlag, New York, NY. Proceedings of the First European Workshop, EWCBR-93.

  • Wettschereck, D. (1994). A Study Of Distance-Based Machine Learning Algorithms. PhD dissertation, Oregon State University, Department of Computer Science, Corvallis, OR.

  • Wijnberg, L. & Johnson, T. (1985). Estimation of missing values in lead air quality data sets. In Johnson, T. R. & Penkala, S. J. (eds.), Quality Assurance in Air Pollution Measurements. Air Pollution Control Association, Pittsburgh, PA. TR-3: Transactions: An APCA International Specialty Conference.

  • Wolberg, G. (1990). Digital Image Warping. IEEE Computer Society Press, Los Alamitos, CA.

  • Yasunaga, M. & Kitano, H. (1993). Robustness of the memory-based reasoning implemented by wafer scale integration. IEICE Transactions on Information and Systems E76-D(3): 336–344.

  • Zografski, Z. (1989). Neuromorphic, Algorithmic, and Logical Models for the Automatic Synthesis of Robot Action. PhD dissertation, University of Ljubljana, Ljubljana, Slovenia, Yugoslavia.

  • Zografski, Z. (1991). New methods of machine learning for the construction of integrated neuromorphic and associative-memory knowledge bases. In Zajc, B. & Solina, F. (eds.), Proceedings, 6th Mediterranean Electrotechnical Conference, volume II, pp. 1150–1153, Ljubljana, Slovenia, Yugoslavia. IEEE catalog number 91CH2964–5.

  • Zografski, Z. (1992). Geometric and neuromorphic learning for nonlinear modeling, control and forecasting. In Proceedings of the 1992 IEEE International Symposium on Intelligent Control, pp. 158–163, Glasgow, Scotland. IEEE catalog number 92CH3110–4.

  • Zografski, Z. & Durrani, T. (1995). Comparing predictions from neural networks and memory-based learning. In Proceedings, ICANN '95/NEURONIMES '95: International Conference on Artificial Neural Networks, pp. 221–226, Paris, France.

Cite this article

Atkeson, C.G., Moore, A.W. & Schaal, S. Locally Weighted Learning. Artificial Intelligence Review 11, 11–73 (1997). https://doi.org/10.1023/A:1006559212014
