The Australian National University
The Fenner School of Environment and Society

M.F. Hutchinson

Introduction

The aim of the ANUSPLIN package is to provide a facility for transparent analysis and interpolation of noisy multi-variate data using thin plate smoothing splines. The package supports this aim by providing comprehensive statistical analyses, data diagnostics and spatially distributed standard errors. It also supports flexible data input and surface interrogation procedures.

The original thin plate (formerly Laplacian) smoothing spline surface fitting technique was described by Wahba (1979), with modifications for larger data sets due to Bates and Wahba (1982), Elden (1984), Hutchinson (1984) and Hutchinson and de Hoog (1985). The extension to partial splines is based on Bates et al. (1987). This allows for the incorporation of parametric linear sub-models (or covariates), in addition to the independent spline variables. This is a robust way of allowing for additional dependencies, provided a parametric form for these dependencies can be determined. In the limiting case of no independent spline variables (not currently permitted), the procedure would become simple multi-variate linear regression.
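For orientation, the partial spline model described above can be written in the standard notation of Wahba (1990). This is a sketch of the unweighted form only. The symbols (z_i for the data values, x_i for the independent spline variables, psi_ij for the value of covariate j at data point i, beta_j for its coefficient, lambda for the smoothing parameter and J_m for the roughness penalty of order m) are chosen here for illustration and do not appear elsewhere on this page.

```latex
% Data model: smooth spline term plus a parametric linear sub-model plus noise
z_i = f(\mathbf{x}_i) + \sum_{j=1}^{p} \beta_j \psi_{ij} + \varepsilon_i ,
\qquad i = 1, \dots, n

% f and the coefficients \beta_j are estimated by minimising
\frac{1}{n} \sum_{i=1}^{n}
  \Bigl( z_i - f(\mathbf{x}_i) - \sum_{j=1}^{p} \beta_j \psi_{ij} \Bigr)^{2}
  + \lambda \, J_m(f)
```

If the spline term f is dropped, only the sum over covariates remains and the criterion reduces to ordinary least squares, which is the limiting case of multi-variate linear regression mentioned above.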

Thin plate smoothing splines can in fact be viewed as a generalisation of standard multi-variate linear regression, in which the parametric model is replaced by a suitably smooth non-parametric function. The degree of smoothness, or inversely the degree of complexity, of the fitted function is usually determined automatically from the data by minimising a measure of predictive error of the fitted surface given by the generalised cross validation (GCV). Theoretical justification of the GCV and demonstration of its performance on simulated data have been given by Craven and Wahba (1979).
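The role of the GCV can be illustrated with a small numerical sketch. The code below is not ANUSPLIN code: it is a minimal NumPy illustration, assuming a second-order thin plate spline in two variables, simulated data, and the textbook form GCV = (RSS/n) / (tr(I - A)/n)^2, where A is the influence matrix that maps data values to fitted values. The function names tps_kernel and gcv_curve are hypothetical.

```python
import numpy as np


def tps_kernel(a, b):
    """Kernel r^2 log r of the second-order thin plate spline in two dimensions.

    The constant factor 1/(8*pi) is omitted; this only rescales lambda.
    """
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)  # squared distances
    with np.errstate(divide="ignore", invalid="ignore"):
        k = 0.5 * d2 * np.log(d2)  # equals r^2 log r, since log r = 0.5 log r^2
    return np.where(d2 > 0.0, k, 0.0)  # the kernel is taken as 0 at r = 0


def gcv_curve(xy, z, lambdas):
    """Return the GCV for each candidate smoothing parameter in lambdas."""
    n = len(z)
    T = np.column_stack([np.ones(n), xy])       # unpenalised linear terms
    Q, _ = np.linalg.qr(T, mode="complete")
    Q2 = Q[:, T.shape[1]:]                      # basis orthogonal to the linear terms
    B = Q2.T @ tps_kernel(xy, xy) @ Q2
    B = 0.5 * (B + B.T)                         # remove rounding asymmetry
    evals, U = np.linalg.eigh(B)
    w = U.T @ (Q2.T @ z)
    gcv = []
    for lam in lambdas:
        # Eigenvalues of (I - A) on the subspace spanned by Q2.
        shrink = n * lam / (evals + n * lam)
        rss = np.sum((shrink * w) ** 2)         # residual sum of squares
        trace = np.sum(shrink)                  # tr(I - A)
        gcv.append((rss / n) / (trace / n) ** 2)
    return np.array(gcv)


# Simulated noisy surface on the unit square (illustrative data only).
rng = np.random.default_rng(0)
xy = rng.uniform(0.0, 1.0, size=(60, 2))
z = np.sin(3.0 * xy[:, 0]) + xy[:, 1] ** 2 + rng.normal(0.0, 0.1, size=60)

lambdas = 10.0 ** np.arange(-6.0, 3.0)
gcv = gcv_curve(xy, z, lambdas)
best_lambda = lambdas[np.argmin(gcv)]           # smoothing chosen by minimum GCV
```

Very small values of lambda approach exact interpolation of the noisy data, while very large values approach a least squares plane. The minimum of the GCV curve selects a compromise between the two.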

A comprehensive introduction to the technique of thin plate smoothing splines, with various extensions, is given in Wahba (1990). A brief overview of the basic theory and applications to spatial interpolation of monthly mean climate is given in Hutchinson (1991a). More comprehensive discussion of the algorithms and associated statistical analyses, and comparisons with kriging, are given in Hutchinson (1993) and Hutchinson and Gessler (1994). Recent applications to annual and daily precipitation data have been described by Hutchinson (1995, 1998ab).

It is often convenient, particularly when processing climate data, to process several surfaces simultaneously. ANUSPLIN now allows for arbitrarily many such surfaces and introduces the concept of "surface independent variables", so that independent variables may change systematically from surface to surface. ANUSPLIN permits systematic interrogation of these surfaces, and their standard errors, in both point and grid form. ANUSPLIN also permits transformations of both independent and dependent variables.

A brief summary of the eight programs which make up the ANUSPLIN package is tabulated in the following section, accompanied by a flow chart showing the main connections between the programs. This is followed by detailed documentation for each program in the package. The User Guide concludes with a comprehensive discussion of example smoothing spline analyses of uni-variate data and multi-variate climate data. The data supporting these analyses are supplied with the package. These analyses can be used as a tutorial on the basic concepts of data smoothing, with particular applications to the spatial interpolation of climate.

Program Summary

PROGRAM

DESCRIPTION

SPLINA

A program which fits an arbitrary number of (partial) thin plate smoothing spline functions of one or more independent variables. It is suitable for data sets with up to about 2000 points; the number of data points is otherwise not limited. The degree of data smoothing is normally determined by minimising the generalised cross validation (GCV) or the generalised maximum likelihood (GML) of the fitted surface.

SPLINB

An approximate version of SPLINA designed for larger data sets. It uses knots which are initially selected by SELNOT and updated by ADDNOT. It is suitable for data sets with up to about 10,000 data points and up to about 2000 knots; the number of data points is otherwise not limited.

SELNOT

Selects an initial set of knots for use by SPLINB.

ADDNOT

Updates knot index file when additional knots are selected from the ranked residual list produced by SPLINB.

DELNOT

Adjusts knot index file when points are removed from the data file to be used by SPLINB.

GCVGML

Calculates the GCV or GML for each surface, and the average GCV or GML, for a range of values of the smoothing parameter. It can be applied to surfaces fitted by SPLINA or SPLINB. The values are written to a file for inspection and for plotting.

LAPPNT

Calculates values, and Bayesian standard error estimates, of partial thin plate smoothing spline surfaces at points supplied in a file.

LAPGRD

Calculates values, and Bayesian standard error estimates, of partial thin plate smoothing spline surfaces on a regular rectangular grid.

Main Data Flows

 

The flow chart below shows the main data flows through the programs described in the program summary. The overall analysis proceeds from point data to output point and grid files suitable for storage and plotting by a geographic information system (GIS) and other plotting packages. The analyses by SPLINA and SPLINB provide up to six output files. These provide statistical analyses, support detection of data errors (an important phase of the analysis) and facilitate determination of additional knots by ADDNOT for the SPLINB program. The output surface coefficients and error covariance matrices enable systematic interrogation of the fitted surfaces by LAPPNT and LAPGRD. The GCV files output by GCVGML (formerly AVGCVA and AVGCVB) can also assist detection of data errors and revision of the specifications of the spline model.

[Flow chart: main data flows through the ANUSPLIN programs]

Fitting Climate Surfaces

The surface fitting procedure was primarily developed for this task so that there are normally at least two independent spline variables, longitude and latitude, in this order and in units of decimal degrees. A third independent variable, elevation above sea-level, is often appropriate when fitting surfaces to temperature or precipitation. This is normally included as a third independent spline variable, in which case it should be scaled to be in units of kilometres. Minor improvements can sometimes be had by slightly altering this scaling of elevation. This scaling was originally determined by Hutchinson and Bischof (1983) and has been verified by Hutchinson (1995, 1998b).
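The variable ordering and elevation scaling described above can be sketched as a small data preparation step. This is an illustration only: the station values are invented, and the function name spline_variables and its elevation_scale_m parameter are hypothetical, not part of ANUSPLIN.

```python
import numpy as np

# Hypothetical station records: longitude and latitude in decimal degrees,
# elevation in metres.
stations = np.array([
    [149.12, -35.30, 578.0],
    [148.26, -36.43, 1760.0],
    [151.21, -33.87, 39.0],
])


def spline_variables(records, elevation_scale_m=1000.0):
    """Return columns (longitude, latitude, scaled elevation).

    With the default scale of 1000 the third column is in kilometres.
    """
    lon, lat, elev_m = records.T
    return np.column_stack([lon, lat, elev_m / elevation_scale_m])


X = spline_variables(stations)
```

Because a thin plate spline treats its independent variables isotropically, this scaling gives one kilometre of elevation the same weight as one degree of longitude or latitude. Slightly altering the scaling, as mentioned above, corresponds to changing elevation_scale_m in this sketch.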

Over restricted areas, superior performance can sometimes be achieved by including elevation not as an independent spline variable but as an independent covariate. Thus, in the case of fitting a temperature surface, the coefficient of an elevation covariate would be an empirically determined temperature lapse rate (Hutchinson, 1991a). Other factors which influence the climate variable may be included as additional covariates if appropriate parameterizations can be determined and the relevant data are available. These might include, for example, topographic effects other than elevation above sea-level. Other applications to climate interpolation have been described by Hutchinson et al. (1984ab, 1996a) and Hutchinson (1989a, 1991ab). Applications of fitted spline climate surfaces to global agroclimatic classifications and to the assessment of biodiversity are described by Hutchinson et al. (1992, 1996b).
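The interpretation of the covariate coefficient as a lapse rate can be illustrated with simulated data. The sketch below is not the ANUSPLIN algorithm: it assumes a fixed smoothing parameter, a second-order thin plate spline in two position variables, and the textbook linear system for a partial spline, in which the covariate is simply added to the unpenalised terms. The function names, the data and the lapse rate of -6.5 degrees per kilometre are all invented for the example.

```python
import numpy as np


def tps_kernel(a, b):
    """Kernel r^2 log r (constant factor omitted; it only rescales lambda)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    with np.errstate(divide="ignore", invalid="ignore"):
        k = 0.5 * d2 * np.log(d2)
    return np.where(d2 > 0.0, k, 0.0)


def fit_partial_tps(xy, covariates, z, lam):
    """Fit a partial thin plate spline with a fixed smoothing parameter.

    Returns the kernel weights and the coefficients of the unpenalised
    terms, ordered as (constant, x, y, covariates...).
    """
    n = len(z)
    T = np.column_stack([np.ones(n), xy, covariates])
    p = T.shape[1]
    A = np.zeros((n + p, n + p))
    A[:n, :n] = tps_kernel(xy, xy) + n * lam * np.eye(n)
    A[:n, n:] = T
    A[n:, :n] = T.T                      # side condition T' w = 0
    rhs = np.concatenate([z, np.zeros(p)])
    solution = np.linalg.solve(A, rhs)
    return solution[:n], solution[n:]


# Simulated stations: smooth spatial trend plus a linear elevation effect.
rng = np.random.default_rng(1)
n = 80
xy = rng.uniform(0.0, 1.0, size=(n, 2))
elev_km = rng.uniform(0.0, 2.0, size=n)
true_lapse = -6.5                        # degrees per km, assumed for the demo
temp = (20.0 + 2.0 * np.sin(2.0 * xy[:, 0]) - 1.5 * xy[:, 1]
        + true_lapse * elev_km + rng.normal(0.0, 0.2, size=n))

_, coef = fit_partial_tps(xy, elev_km, temp, lam=1e-3)
lapse_rate = coef[3]                     # coefficient of the elevation covariate
```

The estimated coefficient lapse_rate should lie close to the value used to simulate the data, while the spline in the two position variables absorbs the remaining smooth spatial variation.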

To fit multi-variate climate surfaces, the values of the independent variables need only be known at the data points. Thus meteorological stations should be accurately located in position and elevation. Errors in these locations are often indicated by large values in the output ranked residual list. Recent applications have examined the utility of using elevation, slope and aspect obtained from digital elevation models of various horizontal resolutions (Hutchinson 1995, 1998b).

The LAPGRD program can be used to calculate regular grids of fitted climate values and their standard errors, for mapping and other purposes, provided a regular grid of values of each independent variable, additional to longitude and latitude, is supplied. This usually means that a regular grid digital elevation model (DEM) is required. A technique for calculating such DEMs from elevation and stream line data has been described by Hutchinson (1988, 1989b, 1996).
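The requirement for a grid of each additional independent variable can be pictured as follows. The sketch assumes a north-up DEM held as a NumPy array in metres, cell-centre registration and square cells in decimal degrees. The function grid_independent_variables is hypothetical and says nothing about the file formats LAPGRD actually reads.

```python
import numpy as np


def grid_independent_variables(dem_m, lon_min, lat_max, cell_deg):
    """Return one row (longitude, latitude, elevation in km) per grid cell.

    Rows of dem_m run from north to south, columns from west to east.
    Coordinates refer to cell centres.
    """
    nrows, ncols = dem_m.shape
    lon = lon_min + cell_deg * (np.arange(ncols) + 0.5)
    lat = lat_max - cell_deg * (np.arange(nrows) + 0.5)
    lon_grid, lat_grid = np.meshgrid(lon, lat)
    return np.column_stack(
        [lon_grid.ravel(), lat_grid.ravel(), dem_m.ravel() / 1000.0]
    )


# Tiny invented DEM: 2 rows by 3 columns, elevations in metres.
dem = np.array([[100.0, 200.0, 300.0],
                [400.0, 500.0, 600.0]])
cells = grid_independent_variables(dem, lon_min=149.0, lat_max=-35.0, cell_deg=0.5)
```

Each row of cells holds the independent variable values at which a fitted surface depending on longitude, latitude and elevation would be evaluated.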

New Features of ANUSPLIN Version 4.3

  • Spatially distributed standard errors now available for surfaces fitted by SPLINB
  • Data files with missing values supported
  • Improved knot selection algorithm for SELNOT
  • Smoothing can be determined by minimising generalised cross-validation (GCV) or generalised maximum likelihood (GML)
  • AVGCVA and AVGCVB rolled into a single program now called GCVGML

New Features of ANUSPLIN Version 4.2

  • Minor clarifications to documentation on user directives.
  • Expand syntax of LAPGRD to permit reading of grids of values for all independent variables, including the two grid independent variables.
  • Operation of constant smoothing parameter option corrected in SPLINA and SPLINB.
  • Calculation of x coordinates corrected in LAPGRD when writing output grids in xyz format.
  • Summary statistics in log file corrected in LAPPNT.
  • Output surface values corrected for bias in SPLINA, SPLINB, LAPPNT and LAPGRD when using a dependent variable transformation (square root or natural logarithm).
  • Operation of power transformation for independent variables corrected in SPLINA, SPLINB, LAPPNT and LAPGRD.
  • Generic FORTRAN 90 routines used for basic linear algebra operations.
  • Improved detection by LAPGRD of input binary grid files.
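The bias mentioned in the list above arises because squaring (or exponentiating) a value fitted on the transformed scale does not give the mean of the original variable. The sketch below only illustrates the effect for the square root case with simulated numbers. It assumes normally distributed errors on the transformed scale, uses invented values for the mean and standard deviation, and is not a description of the correction ANUSPLIN applies.

```python
import numpy as np

rng = np.random.default_rng(2)
mu, sigma = 3.0, 0.5                 # hypothetical mean and sd on the square root scale

root_values = rng.normal(mu, sigma, size=100_000)
original_values = root_values ** 2   # back-transform each simulated value

naive = mu ** 2                      # squaring the fitted value alone
corrected = mu ** 2 + sigma ** 2     # mean of a squared normal variable
sample_mean = original_values.mean()
```

Here sample_mean agrees with corrected rather than with naive. The gap between the two is the error variance on the transformed scale.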

New Features of ANUSPLIN Version 4.0

  • The former SPLINAA and SPLINA programs have been rolled into a single SPLINA program.
  • The former SPLINBB and SPLINB programs have been rolled into a single SPLINB program.
  • The former ERRGRD and LAPGRD programs have been rolled into a single LAPGRD program which allows calculation of multiple surface grids in one run.
  • The former ERRPNT and LAPPNT programs have been rolled into a single LAPPNT program, which has always allowed calculation of multiple surface values in one run.
  • LAPGRD and LAPPNT now calculate surface values and/or standard error values and/or 95% confidence intervals. The standard error values and confidence intervals can be calculated if the error covariance matrix has been calculated with SPLINA.
  • User prompts for most programs have been simplified.
  • No limits on number of data points or number of knots.
  • The concept of "surface" independent variables has been introduced, e.g. fitting monthly mean solar radiation as function of longitude, latitude and (transformed) monthly mean rainfall. These variables are permitted for all programs in ANUSPLIN.
  • Anisotropic transformation of two independent variables is now supported.
  • On line transformations (natural logarithm and square root) of dependent variables are now supported, with accompanying extended statistical analysis, including standard errors of the back-transformed surface values.
  • A bug in the calculation of standard errors for multiple surfaces has been corrected.
  • The LAPGRD and LAPPNT programs have been sped up by about a factor of two. The same correction has led to a modest speed up of SPLINA and SPLINB.

Future plans

  • Calculation of partial derivatives of fitted spline functions.
  • Capacity to fit additive spline models.
  • Additional on line transformations of dependent variables.

Reading

Bates D and Wahba G. 1982. Computational methods for generalised cross validation with large data sets. In: Baker CTH and Miller GF (eds). Treatment of Integral Equations by Numerical Methods. New York: Academic Press: 283-296.

Bates D, Lindstrom M, Wahba G and Yandell B. 1987. GCVPACK - routines for generalised cross validation. Commun. Statist. B - Simulation and Computation 16: 263-297.

Craven P and Wahba G. 1979. Smoothing noisy data with spline functions. Numerische Mathematik 31: 377-403.

Dongarra JJ, Moler CB, Bunch JR and Stewart GW. 1979. LINPACK Users' Guide. SIAM, Philadelphia.

Elden L. 1984. A note on the computation of the generalised cross-validation function for ill-conditioned least squares problems. BIT 24: 467-472.

Hutchinson MF. 1984. A summary of some surface fitting and contouring programs for noisy data. CSIRO Division of Mathematics and Statistics, Consulting Report ACT 84/6. Canberra, Australia.

Hutchinson MF. 1988. Calculation of hydrologically sound digital elevation models. Third International Symposium on Spatial Data Handling. Columbus, Ohio: International Geographical Union: 117-133.

Hutchinson MF. 1989a. A new objective method for spatial interpolation of meteorological variables from irregular networks applied to the estimation of monthly mean solar radiation, temperature, precipitation and windrun. CSIRO Division of Water Resources Tech. Memo. 89/5: 95-104.

Hutchinson MF. 1989b. A new procedure for gridding elevation and stream line data with automatic removal of spurious pits. Journal of Hydrology 106: 211-232.

Hutchinson MF. 1991a. The application of thin plate smoothing splines to continent-wide data assimilation. In: Jasper JD (ed.). BMRC Research Report No.27, Data Assimilation Systems. Melbourne: Bureau of Meteorology: 104-113.

Hutchinson MF. 1991b. Climatic analyses in data sparse regions. In: Muchow RC and Bellamy JA (eds). Climatic Risk in Crop Production, CAB International, 55-71.

Hutchinson MF. 1993. On thin plate splines and kriging. In: Tarter ME and Lock MD (eds). Computing and Science in Statistics 25. University of California, Berkeley: Interface Foundation of North America: 55-62.

Hutchinson MF. 1995. Interpolating mean rainfall using thin plate smoothing splines. International Journal of GIS 9: 385-403.

Hutchinson MF. 1996. A locally adaptive approach to the interpolation of digital elevation models. Third Conference/Workshop on Integrating GIS and Environmental Modeling. Santa Barbara: NCGIA, University of California. http://www.ncgia.ucsb.edu/conf/SANTA_FE_CD-ROM/santa_fe.html .

Hutchinson MF. 1998a. Interpolation of rainfall data with thin plate smoothing splines: I two dimensional smoothing of data with short range correlation. Journal of Geographic Information and Decision Analysis 2(2): 152-167. http://publish.uwo.ca/~jmalczew/gida_4.htm

Hutchinson MF. 1998b. Interpolation of rainfall data with thin plate smoothing splines: II analysis of topographic dependence. Journal of Geographic Information and Decision Analysis 2(2): 168-185. http://publish.uwo.ca/~jmalczew/gida_4.htm

Hutchinson MF and Bischof RJ. 1983. A new method for estimating the spatial distribution of mean seasonal and annual rainfall applied to the Hunter Valley, New South Wales. Australian Meteorological Magazine 31: 179-184.

Hutchinson MF, Booth TH, Nix HA and McMahon JP. 1984a. Estimating monthly mean values of daily total solar radiation for Australia. Solar Energy 32: 277-290.

Hutchinson MF, Kalma JD and Johnson ME. 1984b. Monthly estimates of wind speed and wind run for Australia. Journal of Climatology 4: 311-324.

Hutchinson MF and de Hoog FR. 1985. Smoothing noisy data with spline functions. Numerische Mathematik 47: 99-106.

Hutchinson MF, Nix HA and McMahon JP. 1992. Climate constraints on cropping systems. In: Pearson CJ (ed.). Ecosystems of the World, 18 Field Crop Ecosystems. Amsterdam: Elsevier: 37-58.

Hutchinson MF and Gessler PE. 1994. Splines - more than just a smooth interpolator. Geoderma 62: 45-67.

Hutchinson MF, Nix HA, McMahon JP and Ord KD. 1996a. The development of a topographic and climate database for Africa. In: Proceedings of the Third International Conference/Workshop on Integrating GIS and Environmental Modeling, NCGIA, Santa Barbara, California. http://www.ncgia.ucsb.edu/conf/SANTA_FE_CD-ROM/santa_fe.html

Hutchinson MF, Belbin L, Nicholls AO, Nix HA, McMahon JP and Ord KD. 1996b. Rapid Assessment of Biodiversity, Volume Two, Spatial Modelling Tools. The Australian BioRap Consortium, Australian National University, 142pp.

Kesteven JL and Hutchinson MF. 1996. Spatial modelling of climatic variables on a continental scale. In: Proceedings of the Third International Conference/Workshop on Integrating GIS and Environmental Modeling, NCGIA, Santa Barbara, California. http://www.ncgia.ucsb.edu/conf/SANTA_FE_CD-ROM/santa_fe.html

Schimek MG (ed.). 2000. Smoothing and regression: approaches, computation and application. New York: John Wiley & Sons.

Silverman BW. 1985. Some aspects of the spline smoothing approach to nonparametric regression curve fitting (with discussion). Journal Royal Statistical Society Series B 47: 1-52.

Wahba G. 1979. How to smooth curves and surfaces with splines and cross-validation. Proc. 24th Conference on the Design of Experiments. US Army Research Office 79-2, Research Triangle Park, NC: 167-192.

Wahba G. 1983. Bayesian confidence intervals for the cross-validated smoothing spline. Journal Royal Statistical Society Series B 45: 133-150.

Wahba G. 1990. Spline Models for Observational Data. CBMS-NSF Regional Conference Series in Applied Mathematics 59, SIAM, Philadelphia, Pennsylvania.


For questions relating to this software contact: Michael.Hutchinson@anu.edu.au
