Sensitivity analysis

Sensitivity analysis (SA) is the study of how the variation (uncertainty) in the output of a mathematical model can be apportioned, qualitatively or quantitatively, to different sources of variation in the input of the model[1].

In more general terms, uncertainty and sensitivity analyses investigate the robustness of a study when the study includes some form of mathematical modelling. Sensitivity analysis measures the change in model output when the input variables are manipulated. This helps quantify how much the output changes for a given change in each input and deepens understanding of the relationships between input and output variables in the model. Uncertainty analysis, by contrast, studies the overall uncertainty in the conclusions of the study, while sensitivity analysis tries to identify which sources of uncertainty weigh most on the study's conclusions. Several guidelines for modelling (see one from the US EPA) or for impact assessment (see one from the European Commission) prescribe sensitivity analysis as a tool to ensure the quality of the modelling/assessment.

It should not be assumed that the relationships between inputs and outputs in a model match the modeller's preconceptions. Sensitivity analysis therefore requires error checking to verify correct correspondence between inputs and outputs. Moreover, model simplification is crucial to the efficiency of a mathematical model: fixing inputs that have no influence on the outputs and removing redundant parts of the model structure contribute to that efficiency.


Overview

Mathematical problems met in social, economic or natural sciences may entail the use of mathematical models, which generally do not lend themselves to a straightforward understanding of the relationship between input factors (what goes into the model) and output (the model’s dependent variables). Such an appreciation, i.e. the understanding of how the model behaves in response to changes in its inputs, is of fundamental importance to ensure a correct use of the models.

A mathematical model is defined by a series of equations, input factors, parameters, and variables aimed to characterize the process being investigated.

Input is subject to many sources of uncertainty including errors of measurement, absence of information and poor or partial understanding of the driving forces and mechanisms. This uncertainty imposes a limit on our confidence in the response or output of the model. Further, models may have to cope with the natural intrinsic variability of the system, such as the occurrence of stochastic events.

Good modeling practice requires that the modeler provide an evaluation of the confidence in the model, possibly assessing the uncertainties associated with the modeling process and with the outcome of the model itself. Uncertainty and sensitivity analysis offer valid tools for characterizing the uncertainty associated with a model. Uncertainty analysis (UA) quantifies the uncertainty in the outcome of a model. Sensitivity analysis has the complementary role of ranking the inputs by the strength and relevance of their contribution to the variation in the output. Often the uncertainty is characterized by plotting the model outputs and applying a variety of statistical analyses such as regression and correlation.

In models involving many input variables, sensitivity analysis is an essential ingredient of model building and quality assurance. In a thorough sensitivity analysis, each input variable is studied in turn, helping the modeler understand the sensitivity of the model to each individual input parameter. National and international agencies involved in impact assessment studies have included sections devoted to sensitivity analysis in their guidelines. Examples are the European Commission, the White House Office of Management and Budget, the Intergovernmental Panel on Climate Change and the US Environmental Protection Agency.

The problem setting in sensitivity analysis has strong similarities with design of experiments. In design of experiments one studies the effect of some process or intervention (the 'treatment') on some objects (the 'experimental units'). In sensitivity analysis one looks at the effect of varying the inputs of a mathematical model on the output of the model itself. In both disciplines one strives to obtain information from the system with a minimum of physical or numerical experiments.

In uncertainty and sensitivity analysis there is a crucial trade-off between how scrupulous an analyst is in exploring the input assumptions and how wide the resulting inference may be. The point is well illustrated by the econometrician Edward E. Leamer (1990) [2]:

I have proposed a form of organized sensitivity analysis that I call ‘global sensitivity analysis’ in which a neighborhood of alternative assumptions is selected and the corresponding interval of inferences is identified. Conclusions are judged to be sturdy only if the neighborhood of assumptions is wide enough to be credible and the corresponding interval of inferences is narrow enough to be useful.

Note Leamer’s emphasis is on the need for 'credibility' in the selection of assumptions. The easiest way to invalidate a model is to demonstrate that it is fragile with respect to the uncertainty in the assumptions or to show that its assumptions have not been taken 'wide enough'. The same concept is expressed by Jerome R. Ravetz, for whom bad modeling is when uncertainties in inputs must be suppressed lest outputs become indeterminate.[3]

In modern econometrics the use of sensitivity analysis to anticipate criticism is the subject of one of the ten commandments of applied econometrics (from Kennedy, 2007[4]):

Thou shall confess in the presence of sensitivity. Corollary: Thou shall anticipate criticism [···] When reporting a sensitivity analysis, researchers should explain fully their specification search so that the readers can judge for themselves how the results may have been affected. This is basically an ‘honesty is the best policy’ approach, advocated by Leamer, (1978[5]).

The use of mathematical modelling can be the subject of controversies, see Nassim Nicholas Taleb[6] in economics, and Orrin H. Pilkey and Linda Pilkey-Jarvis[7] in environmental sciences. As noted by the latter authors, this increases the relevance of sensitivity analysis in today's modelling practice[1].

Methodology

Sampling-based sensitivity analysis by scatterplots. Y (vertical axis) is a function of four factors. The points in the four scatterplots are always the same though sorted differently, i.e. by Z1, Z2, Z3, Z4 in turn. Which factor among Z1, Z2, Z3, Z4 is most important in influencing Y? Note that the abscissa is different for each plot: (−5, +5) for Z1, (−8, +8) for Z2, (−10, +10) for Z3 and Z4. Clue: The most important factor is the one which imparts more 'shape' on Y.

There are several possible procedures to perform uncertainty (UA) and sensitivity analysis (SA). Important classes of methods are:

  • Local methods, such as the simple derivative of the output Y with respect to an input factor X_i: \left|\frac{\partial Y}{\partial X_i}\right|_{\mathbf{x}^0}, where the subscript \mathbf{x}^0 indicates that the derivative is taken at some fixed point in the space of the input (hence the 'local' in the name of the class). Adjoint modelling[8][9] and automated differentiation[10] are methods in this class (a minimal finite-difference sketch is given after this list).
  • Sampling-based methods[11], in which the model is executed repeatedly for combinations of values sampled from the distribution (assumed known) of the input factors. Once the sample is generated, several strategies (including simple input-output scatterplots) can be used to derive sensitivity measures for the factors.
  • Methods based on emulators (e.g. Bayesian[12]). With these methods the value of the output Y, or directly the value of the sensitivity measure of a factor X_i, is treated as a stochastic process and estimated from the available computer-generated data points. This is useful when the computer program which describes the model is expensive to run.
  • Screening methods. This is a particular instance of sampling-based methods. The objective here is to identify the few active factors in models with many factors[13][14].
  • Variance-based methods[15][16][17]. Here the unconditional variance V(Y) of Y is decomposed into terms due to individual factors plus terms due to interactions among factors. Full variance decompositions are only meaningful when the input factors are independent from one another[18].
  • High Dimensional Model Representations (HDMR)[19][20]. The term is due to H. Rabitz[21] and includes the variance-based methods as a particular case. In HDMR the output Y is expressed as a linear combination of terms of increasing dimensionality.
  • Methods based on Monte Carlo filtering[22][23]. These are also sampling-based; the objective here is to identify regions in the space of the input factors corresponding to particular values (e.g. high or low) of the output.
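
As an illustration of the first (local) class of methods, the sketch below approximates the derivative of Y with respect to each X_i at a baseline point by central finite differences. The model function is a hypothetical stand-in for whatever simulator is being analysed, and the step size h is an illustrative assumption, not a prescription from the references above.

    import numpy as np

    def model(x):
        # Hypothetical model Y = f(X1, X2, X3); replace with the real simulator.
        return x[0] ** 2 + 3.0 * x[1] + np.sin(x[2])

    def local_sensitivities(f, x0, h=1e-6):
        # Central-difference approximation of dY/dXi at the baseline point x0.
        x0 = np.asarray(x0, dtype=float)
        grads = np.empty_like(x0)
        for i in range(x0.size):
            xp, xm = x0.copy(), x0.copy()
            xp[i] += h
            xm[i] -= h
            grads[i] = (f(xp) - f(xm)) / (2.0 * h)
        return grads

    x0 = np.array([1.0, 2.0, 0.5])         # the fixed point x^0
    print(local_sensitivities(model, x0))  # one local sensitivity per input factor

Such derivatives are informative only in a neighbourhood of x^0, which is why the global (sampling- and variance-based) methods in the list above are preferred when the input uncertainty is large.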


Often (e.g. in sampling-based methods) UA and SA are performed jointly by executing the model repeatedly for combinations of factor values sampled from some probability distribution. The following steps can be listed (a minimal end-to-end sketch in code follows the list):

  • Specify the target function of interest.
    • It is easier to communicate the results of a sensitivity analysis when the target of interest has a direct relation to the problem tackled by the model.
  • Assign a probability density function to the selected factors.
    • When this involves eliciting experts' opinion, this is the most expensive and time-consuming part of the analysis.
  • Generate a matrix of inputs from those distributions through an appropriate design.
    • As in experimental design, a good design for numerical experiments[24] should give a maximum of effects with a minimum of computed points.
  • Evaluate the model and compute the distribution of the target function.
    • This is the computer-time intensive step.
  • Select a method for assessing the influence or relative importance of each input factor on the target function.
    • This depends upon the purpose of the analysis, e.g. model simplification, factor prioritization, uncertainty reduction, etc.
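
A minimal sketch of these steps, assuming four independent normally distributed factors and squared linear correlation coefficients as the sensitivity measure (both are illustrative assumptions, not requirements of the procedure):

    import numpy as np

    rng = np.random.default_rng(0)

    def model(z):
        # Hypothetical target function of four factors (columns of z).
        return z[:, 0] + 2.0 * z[:, 1] ** 2 + 0.5 * z[:, 2] * z[:, 3]

    # Assign a distribution to each factor (independent normals here) and
    # generate the matrix of inputs.
    n, k = 2000, 4
    Z = rng.normal(loc=0.0, scale=[1.0, 1.0, 2.0, 2.0], size=(n, k))

    # Evaluate the model over the sample; the empirical distribution of Y
    # is the uncertainty analysis result.
    Y = model(Z)
    print("mean:", Y.mean(), "5th-95th percentiles:", np.percentile(Y, [5, 95]))

    # A simple sensitivity measure: the squared linear correlation of each
    # factor with Y (the numerical analogue of reading the scatterplots).
    for i in range(k):
        r = np.corrcoef(Z[:, i], Y)[0, 1]
        print(f"factor Z{i + 1}: r^2 = {r ** 2:.3f}")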

Errors

In sensitivity analysis a Type I error is assessing as important a non-important factor, and a Type II error is assessing as non-important an important factor. A Type III error corresponds to analysing the wrong problem, e.g. via an incorrect specification of the input uncertainties. Possible pitfalls in sensitivity analysis are:

  • Unclear purpose of the analysis. Different statistical tests and measures are applied to the problem and different factor rankings are obtained. The test should instead be tailored to the purpose of the analysis, e.g. one uses Monte Carlo filtering if one is interested in which factors are most responsible for generating high/low values of the output (see the sketch after this list).
  • Too many model outputs are considered. This may be acceptable for quality assurance of sub-models but should be avoided when presenting the results of the overall analysis.
  • Piecewise sensitivity. This is when one performs sensitivity analysis on one sub-model at a time. This approach is non-conservative as it might overlook interactions among factors in different sub-models (Type II error).
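
To make the Monte Carlo filtering idea mentioned in the first bullet concrete, here is a sketch under illustrative assumptions (a toy two-factor model, uniform inputs, and 'behaviour' defined as the top 10% of outputs): the sample is split into behavioural and non-behavioural subsets and a two-sample Kolmogorov-Smirnov test flags the factors whose distributions differ between the two subsets.

    import numpy as np
    from scipy.stats import ks_2samp

    rng = np.random.default_rng(1)

    def model(z):
        # Hypothetical model; high output values are the 'behaviour' of interest.
        return z[:, 0] ** 2 + z[:, 1]

    Z = rng.uniform(-1.0, 1.0, size=(5000, 2))
    Y = model(Z)
    behavioural = Y > np.percentile(Y, 90)  # filter: top 10% of outputs

    # A factor drives the behaviour if its distribution differs between the
    # behavioural and non-behavioural subsamples.
    for i in range(Z.shape[1]):
        stat, p = ks_2samp(Z[behavioural, i], Z[~behavioural, i])
        print(f"factor Z{i + 1}: KS statistic = {stat:.3f}, p-value = {p:.3g}")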

The OAT paradox

Moving one step at a time, first along one axis and then along the other, does not move outside the circle. If the square has side equal to one, the area of the circle is π(1/2)^2 ≈ 0.78. In three dimensions this becomes (4π/3)(1/2)^3 ≈ 0.52, and so on; see the next figure.

In sensitivity analysis a common approach is that of changing one-factor-at-a-time (OAT), to see what effect this produces on the output. This appears a logical approach, as any change observed in the output will unambiguously be due to the single factor changed. Furthermore, by changing one factor at a time one can keep all other factors fixed to their central or baseline values. This increases the comparability of the results (all 'effects' are computed with reference to the same central point in space) and minimizes the chances of computer program crashes, which are more likely when several input factors are changed simultaneously. The latter occurrence is particularly annoying to modellers, as in this case one does not know which factor's variation caused the model to crash.

The paradox is that this approach, apparently sound, is non-explorative, with exploration decreasing rapidly with the number of factors. With two factors, and hence in two dimensions, the OAT explores (partially) a circle instead of the full square (see figure).


In k dimensions, the volume of the hyper-sphere inscribed in (and tangent to) the unit hyper-cube, divided by the volume of the hyper-cube itself, goes rapidly to zero (e.g. it is less than 1% already for k = 10, see figure). Note also that all OAT points are at most at a distance one from the origin by design. Given that the diagonal of the hypercube is \sqrt{k} in k dimensions, if the points are distributed randomly there will be points (in the corners) at distance \frac{\sqrt{k}}{2} from the origin. In ten dimensions there are 2^{10} = 1024 corners.

Of course, when one throws a handful of points into a multidimensional space, these points will be sparse, and the space will in no way be fully explored. Still, even if one has only a handful of points at one's disposal, there is no reason to concentrate all these points close to the origin.
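
The shrinking sphere-to-cube volume ratio quoted above can be checked directly; the short script below evaluates the exact formula for the volume of a k-dimensional ball of radius 1/2 relative to the unit cube.

    import math

    def sphere_to_cube_ratio(k):
        # Volume of the k-ball of radius 1/2 (inscribed in the unit cube),
        # divided by the cube volume of 1.
        return math.pi ** (k / 2) * 0.5 ** k / math.gamma(k / 2 + 1)

    for k in (2, 3, 5, 10):
        print(k, round(sphere_to_cube_ratio(k), 4))
    # roughly 0.7854, 0.5236, 0.1645, 0.0025 -- already below 1% at k = 10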

Applications

Sensitivity analysis can be used

  • To simplify models
  • To investigate the robustness of the model predictions
  • To play what-if analysis exploring the impact of varying input assumptions and scenarios
  • As an element of quality assurance (unexpected factor sensitivities may be associated with coding errors or misspecifications).

It provides as well information on:

  • Factors that mostly contribute to the output variability
  • The region in the space of input factors for which the model output is either maximum or minimum or within pre-defined bounds (see Monte Carlo filtering above)
  • Optimal — or instability — regions within the space of factors for use in a subsequent calibration study
  • Interaction between factors

Sensitivity analysis is common in physics and chemistry[25], in financial applications, risk analysis, signal processing, neural networks and any area where models are developed. Sensitivity analysis can also be used in model-based policy assessment studies. Sensitivity analysis can be used to assess the robustness of composite indicators[26], also known as indices, such as the Environmental Pressure Index.

Environmental

Environmental sensitivity map developed by the National Defence Research Institute, FOA-Risk, in Umeå.[27]


Computer environmental models are increasingly used in a wide variety of studies and applications. For example, global climate models are used for both short-term weather forecasts and long-term climate change projections.

Moreover, computer models are increasingly used for environmental decision-making at a local scale, for example for assessing the impact of a waste water treatment plant on a river flow, or for assessing the behavior and lifetime of bio-filters for contaminated waste water.

In both cases sensitivity analysis may help in understanding the contribution of the various sources of uncertainty to the model output uncertainty and to system performance in general. In these cases, depending on model complexity, different sampling strategies may be advisable, and traditional sensitivity indices have to be generalized to cover multivariate sensitivity analysis, heteroskedastic effects and correlated inputs.

Business

An example of a business use of sensitivity analysis.[28]

In a decision problem, the analyst may want to identify cost drivers as well as other quantities for which better knowledge must be acquired in order to make an informed decision. On the other hand, some quantities have no influence on the predictions, so that resources can be saved at no loss in accuracy by relaxing some of the conditions. See Corporate finance: Quantifying uncertainty. Sensitivity analysis can help in a variety of other circumstances, which can be handled by the settings illustrated below:

  • to identify critical assumptions or compare alternative model structures
  • guide future data collections
  • detect important criteria
  • optimize the tolerance of manufactured parts in terms of the uncertainty in the parameters
  • optimize resource allocation
  • model simplification or model lumping, etc.

However, there are also some problems associated with sensitivity analysis in the business context:

  • Variables are often interdependent, which makes examining each of them individually unrealistic; e.g. changing one factor such as sales volume will most likely affect other factors such as the selling price.
  • Often the assumptions upon which the analysis is based are made by using past experience/data which may not hold in the future.
  • Assigning a maximum and minimum (or optimistic and pessimistic) value is open to subjective interpretation. For instance, one person's 'optimistic' forecast may be more conservative than that of another person performing a different part of the analysis. This sort of subjectivity can adversely affect the accuracy and overall objectivity of the analysis.

Dissemination

Dissemination is done by the Joint Research Centre of the European Commission via summer schools, conferences and training courses. See: http://sensitivity-analysis.jrc.ec.europa.eu

Sensitivity Analysis in ArcGIS

Sensitivity analysis can be performed in ArcMap using the Semivariogram Sensitivity tool (a geostatistical analysis tool).

GIS&T Body of Knowledge

This topic is referenced in section GC6-5 of the GIS&T Body of Knowledge.

References

  1. Saltelli, A., Ratto, M., Andres, T., Campolongo, F., Cariboni, J., Gatelli, D., Saisana, M., and Tarantola, S., 2008, Global Sensitivity Analysis. The Primer, John Wiley & Sons.
  2. Leamer, E., (1990) Let's take the con out of econometrics, and Sensitivity analysis would help. In C. Granger (ed.), Modelling Economic Series. Oxford: Clarendon Press 1990.
  3. Ravetz, J.R., 2007, No-Nonsense Guide to Science, New Internationalist Publications Ltd.
  4. Kennedy, P. (2007). A guide to econometrics, Fifth edition. Blackwell Publishing.
  5. Leamer, E. (1978). Specification Searches: Ad Hoc Inferences with Nonexperimental Data. John Wiley & Sons, Ltd, p. vi.
  6. Taleb, N. N., (2007) The Black Swan: The Impact of the Highly Improbable, Random House.
  7. Pilkey, O. H. and L. Pilkey-Jarvis (2007), Useless Arithmetic. Why Environmental Scientists Can't Predict the Future. New York: Columbia University Press.
  8. Cacuci, Dan G., Sensitivity and Uncertainty Analysis: Theory, Volume I, Chapman & Hall.
  9. Cacuci, Dan G., Mihaela Ionescu-Bujor, Michael Navon, 2005, Sensitivity And Uncertainty Analysis: Applications to Large-Scale Systems (Volume II), Chapman & Hall.
  10. Griewank, A. (2000). Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation. SIAM.
  11. J.C. Helton, J.D. Johnson, C.J. Salaberry, and C.B. Storlie, 2006, Survey of sampling based methods for uncertainty and sensitivity analysis. Reliability Engineering and System Safety, 91:1175–1209.
  12. Oakley, J. and A. O'Hagan (2004). Probabilistic sensitivity analysis of complex models: a Bayesian approach. J. Royal Stat. Soc. B 66, 751–769.
  13. Morris, M. D. (1991). Factorial sampling plans for preliminary computational experiments. Technometrics, 33, 161–174.
  14. Campolongo, F., J. Cariboni, and A. Saltelli (2007). An effective screening design for sensitivity analysis of large models. Environmental Modelling and Software, 22, 1509–1518.
  15. Sobol’, I. (1990). Sensitivity estimates for nonlinear mathematical models. Matematicheskoe Modelirovanie 2, 112–118. in Russian, translated in English in Sobol’ , I. (1993). Sensitivity analysis for non-linear mathematical models. Mathematical Modeling & Computational Experiment (Engl. Transl.), 1993, 1, 407–414.
  16. Homma, T. and A. Saltelli (1996). Importance measures in global sensitivity analysis of nonlinear models. Reliability Engineering and System Safety, 52, 1–17.
  17. Saltelli, A., K. Chan, and M. Scott (Eds.) (2000). Sensitivity Analysis. Wiley Series in Probability and Statistics. New York: John Wiley and Sons.
  18. Saltelli, A. and S. Tarantola (2002). On the relative importance of input factors in mathematical models: safety assessment for nuclear waste disposal. Journal of American Statistical Association, 97, 702–709.
  19. Li, G., J. Hu, S.-W. Wang, P. Georgopoulos, J. Schoendorf, and H. Rabitz (2006). Random Sampling-High Dimensional Model Representation (RS-HDMR) and orthogonality of its different order component functions. Journal of Physical Chemistry A 110, 2474–2485.
  20. Li, G., S.-W. Wang, and H. Rabitz (2002). Practical approaches to construct RS-HDMR component functions. Journal of Physical Chemistry 106, 8721–8733.
  21. Rabitz, H. (1989). System analysis at molecular scale. Science, 246, 221–226.
  22. Hornberger, G. and R. Spear (1981). An approach to the preliminary analysis of environmental systems. Journal of Environmental Management 7, 7–18.
  23. Saltelli, A., S. Tarantola, F. Campolongo, and M. Ratto (2004). Sensitivity Analysis in Practice: A Guide to Assessing Scientific Models. John Wiley and Sons.
  24. Sacks, J., W. J. Welch, T. J. Mitchell, and H. P. Wynn (1989). Design and analysis of computer experiments. Statistical Science 4, 409–435.
  25. Saltelli, A., M. Ratto, S. Tarantola and F. Campolongo (2005) Sensitivity Analysis for Chemical Models, Chemical Reviews, 105(7) pp 2811 – 2828.
  26. Saisana M., Saltelli A., Tarantola S., 2005, Uncertainty and Sensitivity analysis techniques as tools for the quality assessment of composite indicators, Journal Royal Statistical Society A, 168 (2), 307–323.
  27. Edwards, Janet (2002), Risk-Era: The Swedish Rescue Service's Tool for Community Risk Management. Esri conference 2002, Paper #00307. http://proceedings.esri.com/library/userconf/proc02/pap0307/p0307.htm
  28. Targeted Product Development & Value Modeling (2010). http://leveragepoint.typepad.com/leveragepoint_perspective/2010/01/targeted-product-development-value-modeling.html

Bibliography

  • Fassò A. (2007) Statistical sensitivity analysis and water quality. In Wymer L. Ed, Statistical Framework for Water Quality Criteria and Monitoring. Wiley, New York.
  • Fassò A., Esposito E., Porcu E., Reverberi A.P., Vegliò F. (2003) Statistical Sensitivity Analysis of Packed Column Reactors for Contaminated Wastewater. Environmetrics, Vol. 14, n. 8, 743–759.
  • Fassò A., Perri P.F. (2002) Sensitivity Analysis. In Abdel H. El-Shaarawi and Walter W. Piegorsch (eds) Encyclopedia of Environmetrics, Volume 4, pp 1968–1982, Wiley.
  • Saltelli, A., S. Tarantola, and K. Chan (1999). Quantitative model-independent method for global sensitivity analysis of model output. Technometrics 41(1), 39–56.
  • Santner, T. J.; Williams, B. J.; Notz, W.I. Design and Analysis of Computer Experiments; Springer-Verlag, 2003.

Special issue publications

Both are selections of papers presented at the 2007 Conference on Sensitivity Analysis of Model Output (SAMO), held in Budapest in June 2007. See SAMO 2007 for the slides of the presentations.

See also