ANNOTATED SELECTED READINGS
(with annotation often taken directly from the abstracts)
Akaike, H. 1973. Information Theory and an Extension of the Maximum Likelihood Principle. (reprinted, with commentary). Pages 610-624 In: Kotz, S., and N.L. Johnson, editors. Breakthroughs in Statistics Volume 1. Foundations and Basic Theory. Springer Series in Statistics, Perspectives in Statistics. Springer-Verlag:
Heavy-duty mathematical statistics paper on the development of links between likelihood and information theory.
Berger, J. O. 2003. Could Fisher, Jeffreys and Neyman have agreed on testing? Statistical Science 18:1-32.
Abstract: Ronald Fisher advocated testing using p-values, Harold Jeffreys proposed use of objective posterior probabilities of hypotheses and Jerzy Neyman recommended testing with fixed error probabilities. Each was quite critical of the other approaches. Most troubling for statistics and science is that the three approaches can lead to quite different practical conclusions. This article focuses on discussion of the conditional frequentist approach to testing, which is argued to provide the basis for a methodological unification of the approaches of Fisher, Jeffreys and Neyman. The idea is to follow Fisher in using p-values to define the "strength of evidence" in data and to follow his approach of conditioning on strength of evidence; then follow Neyman by computing Type I and Type II error probabilities, but do so conditional on the strength of evidence in the data. The resulting conditional frequentist error probabilities equal the objective posterior probabilities of the hypotheses advocated by Jeffreys.
Burnham, K. P., and D. R. Anderson. 2004. Multimodel inference: understanding AIC and BIC in model selection. Sociological Methods & Research 33(2):261-304.
Abstract: The model selection literature has been generally poor at reflecting the deep foundations of the Akaike information criterion (AIC) and at making appropriate comparisons to the Bayesian information criterion (BIC). There is a clear philosophy, a sound criterion based in information theory, and a rigorous statistical foundation for AIC. AIC can be justified as Bayesian using a "savvy" prior on models that is a function of sample size and the number of model parameters. Furthermore, BIC can be derived as a non-Bayesian result. Therefore, arguments about using AIC versus BIC for model selection cannot be from a Bayes versus frequentist perspective. The philosophical context of what is assumed about reality, approximating models, and the intent of model-based inference should determine whether AIC or BIC is used. Various facets of such multimodel inference are presented here, particularly methods of model averaging.
Chamberlin, T. C. 1997. The method of multiple working hypotheses. Appendix. p. 281-293 In: R. Hilborn and M. Mangel, editors. Ecological Detective. Confronting Models with Data.
Chamberlin defends the method of multiple working hypotheses, on which likelihood rests, as the only method that is feasible in some disciplines (e.g., geology) and that supports greatest flexibility in our approach to science.
Efron, B. 1998. R. A. Fisher in the 21st Century. Statistical Science 13:95-122.
An excellent philosophical discussion of Fisher's development of likelihood as a basis for inference (and in contrast to frequentist and Bayesian approaches). Contains a series of comments by other statisticians at the end.
Hobbs, N. T., and R. Hilborn. 2006. Alternatives to statistical hypothesis testing in ecology: A guide to self teaching. Ecological Applications 16:5-19.
Johnson, J. B., and K. S. Omland. 2004. Model selection in ecology and evolution. Trends in Ecology and Evolution 19(2):101-108.
Recently, researchers in several areas of ecology and evolution have begun to change the way in which they analyze data and make biological inferences. Rather than the traditional null hypothesis testing approach, they have adopted an approach called model selection, in which several competing hypotheses are simultaneously confronted with data. Model selection can be used to identify a single best model, thus lending support to one particular hypothesis, or it can be used to make inferences based on weighted support from a complete set of competing models. Model selection is widely accepted and well developed in certain fields, most notably in molecular systematics and mark-recapture analysis. However, it is now gaining support in several other areas, from molecular evolution to landscape ecology. Here, we outline the steps of model selection and highlight several ways that it is now being implemented. By adopting this approach, researchers in ecology and evolution will find a valuable alternative to traditional null hypothesis testing, especially when more than one hypothesis is plausible.
Limpert, E., W. A. Stahel, and M. Abbt. 2001. Log-normal distributions across the sciences: keys and clues. BioScience 51:341-352.
Fisher, R. A. 1922. On the Mathematical Foundations of Theoretical Statistics. (reprinted, with commentary). Pages 11-44 In: Kotz, S., and N.L. Johnson, editors. Breakthroughs in Statistics Volume 1. Foundations and Basic Theory. Springer Series in Statistics, Perspectives in Statistics. Springer-Verlag:
Goffe, W. L., G. D. Ferrier, and J. Rogers. 1994. Global optimization of statistical functions with simulated annealing. Journal of Econometrics 60:65-99.
Discusses simulated annealing and the Metropolis algorithm, a flexible optimization method for parameter estimation.
Nester, M. R. 1996. An applied statistician's creed. Applied Statistician 45:401-410.
An irreverent look at the shortcomings of traditional hypothesis testing. Hypothesis testing, as performed in the applied sciences, is criticized. Then assumptions that the author believes should be axiomatic in all statistical analyses are listed. These assumptions render many hypothesis tests superfluous. The author argues that the image of statisticians will not improve until the nexus between hypothesis testing and statistics is broken.
Platt, J. R. 1964. Strong inference. Science 146:347-353.
The flip-side. The manifesto of experimental approaches and statistical hypothesis testing.
Commentary on Chamberlin's seminal paper (see above).
Cadigan, N. G., and R. A. Myers. 2001. A comparison of gamma and lognormal maximum likelihood estimators in a sequential population analysis. Canadian Journal of Fisheries and Aquatic Science 58:560-567.
Canham, CD, Finzi, SC, Pacala, SW and DH.
Maximum likelihood techniques were used to estimate species-specific light extinction coefficients, using fish-eye photography combined with data on the locations and geometry of trees in the neighborhood around each photo point.
Canham, C. D., Papaik, M. J., and Latty, E. F. variation in susceptibility to windthrow as a function of tree size and storm severity for northern temperate tree species. Canadian Journal of
Studies of wind disturbance regimes have been hampered by the lack of methods to quantify variation in both storm severity and the responses of tree species to winds of varying intensity. In this paper, we report the development of a new, empirical method of simultaneously estimating both local storm severity and the parameters of functions that define species-specific variation in susceptibility to windthrow as a function of storm severity and tree size.
Canham, C. D., P. T. LePage, and K. D. Coates. 2004. A neighborhood analysis of canopy tree competition: effects of shading versus crowding. Canadian Journal of
We have developed extensions of traditional distance-dependent, spatial competition analyses that estimate the magnitude of the competitive effects of neighboring trees on target tree growth as a function of the species, size, and distance to neighboring trees. Our analyses also estimate inter- and intra-specific competition coefficients and explicitly partition the competitive effects of neighbors into the effects of shading versus crowding.
Canham, C. D., M. L. Pace, M. J. Papaik, A. G. B. Primack, K. M. Roy, R. J. Maranger, R. P. Curran, and D. M. Spada. 2004. A spatially-explicit watershed-scale analysis of dissolved organic carbon in Adirondack lakes. Ecological Applications 14(3):839-854.
Terrestrial ecosystems contribute significant amounts of dissolved organic carbon (DOC) to aquatic ecosystems. Temperate lakes vary in DOC concentration as a result of variation in the spatial configuration and composition of vegetation within the watershed, hydrology, and within-lake processes. We have developed and parameterized a spatially explicit model of
Canham, C. D., and M. Uriarte. In press. Analysis of neighborhood dynamics of forest ecosystems using likelihood methods and modeling. Ecological Applications.
Presents spatially-explicit, maximum likelihood models of seedling recruitment and seed dispersal as case studies for the analyses of forest processes in a neighborhood framework.
Canham, C. D., M. Papaik, M. Uriarte, W. McWilliams, J. C. Jenkins, and M. Twery. In press. Neighborhood analyses of canopy tree competition along environmental gradients in New England forests. Ecological Applications.
We use permanent plot data from the U.S.D.A. Forest Service Forest Inventory and Analysis (FIA) program for an analysis of the effects of competition on tree growth along environmental gradients for the 14 most abundant tree species in forests of northern New England. Our analysis estimates actual growth for each individual tree of a given species as a function of average potential diameter growth modified by 3 sets of scalars that quantify the effects on growth of (1) initial target tree size (DBH, in cm), (2) local environmental conditions, and (3) crowding by neighboring trees.
Clark, J. S., E. Macklin, and L. Wood. 1998. Stages and spatial scales of recruitment limitation in southern Appalachian forests. Ecological Monographs 68(2):213-235.
Argues that species can be divided according to their dispersal kernel: those that disperse far and scattered and those that disperse near and clumped. They argue for the use of different probability distributions to accommodate these two modes.
Clark, J. S., M. Silman, R. Kern, E. Macklin, and J. HilleRisLambers. 1999. Seed dispersal near and far: patterns across temperate and tropical forests. Ecology 80(5):1475-1494.
Expands on the inverse modeling method of seedling recruitment (Ribbens et al., below) by making the case that the function used to represent the dispersal curve (i.e., number of recruits as a function of distance from the parent tree) is inadequate. They argue that the function used by Ribbens et al. cannot accommodate long-distance dispersal and offer as an alternative a 2Dt function that can accommodate dispersal both near and far.
The goals of the article are to outline issues concerning the value of ecological models and some possible motivations for modeling, and to provide an entry point to the established modeling literature so that those that are beginning to think about using models in their research can integrate modeling usefully.
Kobe, R. K., and K. D. Coates. 1997. Models of sapling mortality as a function of growth to characterize interspecific variation in shade tolerance of eight tree species of northwestern British Columbia. Canadian Journal of
We characterized juvenile survivorship of 10 dominant tree species of oak transition-northern hardwood forests using species-specific mathematical models. The mortality models predict a sapling's probability of dying as a function of its recent growth history. We describe the statistical bases and the field methods used to calibrate the mortality models.
Legendre, P. 1993. Spatial autocorrelation: trouble or new paradigm? Ecology 74(6):1659-1673.
Autocorrelation is a very general statistical property of ecological variables observed across geographic space; its most common forms are patches and gradients. Spatial autocorrelation, which comes either from the physical forcing of environmental variables or from community processes, presents a problem for statistical testing because autocorrelated data violate the assumption of independence of most standard statistical procedures. The paper discusses first how autocorrelation in ecological variables can be described and measured, with emphasis on mapping techniques. Then, proper statistical testing in the presence of autocorrelation is briefly discussed. Finally, ways are presented of explicitly introducing spatial structures into ecological models. Two approaches are proposed; in the raw-data approach, the spatial structure takes the form of a polynomial of the x and y geographic coordinates of the sampling stations; in the matrix approach, the spatial structure is introduced in the form of a geographic distance matrix among locations.
LePage, P. T., C. D. Canham, K. D. Coates, and P. Bartemucci. 2000. Seed abundance versus substrate limitation of seedling recruitment in northern temperate forests of British Columbia. Canadian Journal of
We examine the influence of (i) the spatial distribution and abundance of parent trees (as seed sources) and (ii) the abundance and favourability of seedbed substrates, on seedling recruitment for the major tree species in northwestern interior cedar–hemlock forests of British Columbia, under four levels of canopy openness (full canopy, partial canopy, large gap, and clearcut).
Loehle, C. 1987. Hypothesis testing in ecology: psychological aspects and the importance of theory maturation. The Quarterly Review of Biology 62(4):397-409.
Explores how confirmation bias influences hypothesis testing in ecology. This bias, however, protects new ideas and allows them to mature before they are presented to the scientific community.
McGill, B. 2003. Strong and weak tests of macroecological theory. Oikos 102(3):679-685.
Argues against "curve-fitting" as a method of discovery. It analyzes patterns predicted by Hubbell's neutral theory.
Møller, A. P., and M. D. Jennions. 2002. How much variance can be explained by ecologists and evolutionary biologists? Oecologia 132:492-500.
Relies on a meta-analysis of published studies to argue that, as ecologists and evolutionary biologists, we should only hope to explain about 5-10% of the variance in our data.
Oksanen, L. 2001. Logic of experiments in ecology: is pseudoreplication a pseudoissue? Oikos 94:27-38.
Peek, M. S., A. J. Leffler, S. D. Flint, and R. J. Ryel. 2003. How much variance is explained by ecologists? Additional perspectives. Oecologia 137:161-170.
Critiques the methods of Møller and Jennions (see above) and argues that we should be able to explain a much higher percentage of the variance.
Reed, W. J. and E. A. Johnson. 2004. Statistical methods for estimating historical fire frequency from multiple fire-scar data. Can. J. For. Res. 34:2306–2313.
This paper considers the statistical analysis of fire-interval charts based on fire-scar data. Estimation of the fire interval (expected time between scar-registering fires at any location) by maximum likelihood is presented. Because fires spread, causing a lack of independence in scar registration at distinct sites, an overdispersed binomial model is used, leading to a two-variable quasi-likelihood function. From this, point estimates, standard errors, and approximate confidence intervals for fire interval and related quantities can be derived. Methods of testing for the significance of spatial and temporal differences are also discussed. A simple example using artificial data is given to illustrate the computational steps involved, and an analysis of real fire-scar data is presented.
Ribbens, E., J. A. Silander Jr., and S. W. Pacala. 1994. Seedling recruitment in forests: calibrating models to predict patterns of tree seedling dispersion. Ecology 75(6):1794-1806.
Presents a method for calibrating spatial models of plant recruitment that does not require identifying the specific parent of each recruit. This method calibrates seedling recruitment functions by comparing tree seedling distributions with adult distributions via a maximum likelihood analysis. The models obtained from this method can then be used to predict the spatial distributions of seedlings from adult distributions.
Ruel, J. J., and M. P. Ayres. 1999. Jensen's inequality predicts effects of environmental variation. Trends in Ecology and Evolution 14(9):361-366.
Many biologists now recognize that environmental variance can exert important effects on patterns and processes in nature that are independent of average conditions. Jensen's inequality is a mathematical proof that is seldom mentioned in the ecological literature but which provides a powerful tool for predicting some direct effects of environmental variance in biological systems. Qualitative predictions can be derived from the form of the relevant response functions (accelerating versus decelerating). Knowledge of the frequency distribution (especially the variance) of the driving variables allows quantitative estimates of the effects. Jensen's inequality has relevance in every field of biology that includes nonlinear processes.
Schnurr, J. L., C. D. Canham, R. S. Ostfeld, and R. S. Inouye. 2004. Neighborhood analyses of small mammal dynamics: Implications for seed predation and seedling establishment. Ecology 85:741-755.
spatial distribution of canopy trees in the temperate deciduous forests of the
Stephens, P.A., S.W. Buskirk, G.D. Hayward and C. Martinez del Rio. 2005. Information theory and hypothesis testing: a call for pluralism. Journal of Applied Ecology 42:4-12.
A major paradigm shift is occurring in the approach of ecologists to statistical analysis. The use of the traditional approach of null-hypothesis testing has been questioned and an alternative, model selection by information–theoretic methods, has been strongly promoted and is now widely used. For certain types of analysis, information–theoretic approaches offer powerful and compelling advantages over null-hypothesis testing. The benefits of information–theoretic methods are often framed as criticisms of null-hypothesis testing. We argue that many of these criticisms are neither irremediable nor always fair. Many are criticisms of the paradigm's application, rather than of its formulation. Information–theoretic methods are equally vulnerable to many such misuses. Care must be taken in the use of either approach but users of null-hypothesis tests, in particular, must greatly improve standards of reporting and interpretation.
Uriarte, M., C. D. Canham, J. Thompson, and J. K. Zimmerman. 2004. A neighborhood analysis of tree growth and survival in a hurricane-driven tropical forest. Ecological Monographs 74:591-614.
We present a likelihood-based regression method that was developed to analyze the effects of neighborhood competitive interactions and hurricane damage on tree growth and survival. The purpose of the method is to provide robust parameter estimates for a spatially explicit forest simulator and to gain insight into the processes that drive the patterns of species abundance in tropical forests. We test the method using census data from the 16-ha Luquillo Forest Dynamics Plot in Puerto Rico and describe effects of the spatial configuration, sizes, and species of neighboring trees on the growth and survival of 12 dominant tree species representing a variety of life history strategies.
Uriarte, M., R. Condit, C. D. Canham, and S. P. Hubbell. 2004. A spatially explicit model of sapling growth in a tropical forest: does the identity of neighbours matter? Journal of Ecology 92:348-360.
neighbourhood effects on sapling growth for 60 tree species in
Uriarte, M., C. D. Canham, J. Thompson, J. K. Zimmerman, and N. Brokaw. 2005. Seedling recruitment in a hurricane-driven tropical forest: light limitation, density-dependence and the spatial distribution of parent trees. Journal of Ecology 93:291-304.
We used inverse modelling to parameterize spatially-explicit seedling recruitment functions for nine canopy tree species in the Luquillo Forest Dynamics Plot (LFDP), Puerto Rico. We modelled the observed spatial variation in seedling recruitment following Hurricane Georges as a function of the potential number of seedlings at a given location (based on local source trees and the potential contribution of parents from outside of the mapped area) and of light levels and density-dependent mortality during establishment. We adopted the model comparison paradigm and compared the performance of increasingly complex models against a null model that assumes uniform seedling distribution across the plot.
models for tropical forests: a synthesis of models and methods.
Discusses advantages and limitations of different modeling approaches in tropical forests.
de Valpine, P., and A. Hastings. 2002. Fitting population models incorporating process noise and observation error. Ecological Monographs 72(1):57-76.
Evaluates a method for fitting models to time series of population abundances that incorporates both process noise and observation error in a likelihood framework.
Wright, E. F., K. D. Coates, C. D. Canham, and P. Bartemucci. 1998. Species variability in growth response to light across climatic regions in northwestern British Columbia. Canadian Journal
Characterizes variation in radial and height growth of saplings of 11 tree species across a range of light levels in boreal, sub-boreal, subalpine, and temperate forests of northwestern British Columbia.
Wright, E. F., C. D. Canham, and K. D. Coates. 2000. Effects of suppression and release on sapling growth for 11 tree species of northern, interior British Columbia. Canadian Journal of
Saplings of canopy tree species frequently undergo alternating periods of suppression and release before reaching canopy size. In this study, we document the effects of periods of suppression and release on current responses to variation in light by saplings of the 11 major tree species of northwestern, interior
Yuancai, L., and B. R. Parresol. 2001. Remarks on height-diameter modeling.
Burnham, K. P., and D. R. Anderson. 1998. Information theory and log-likelihood models: a basis for model selection and inference. Chapter 2 & Chapter 3 in Model selection and inference: a practical information theoretic approach.
Edwards, E. W. framework of inference. Chapter 1 (Pages 1-7) in Likelihood.
concept of likelihood. Chapter 2 (Pages 8-24) in Likelihood.
Hilborn, R., and M. Mangel. 1997. Probability and probability models: know your data. (Pages 39-93) in The Ecological Detective: Confronting Models with Data.
Pawitan, Y. 2001. Pages 8-19 in Statistical Modelling and Inference
Bolker, B. Bits of philosophy.
Casella, G., and R. L. Berger. 2001. Estimation: Point and Interval. Pages 4744-4749. International Encyclopedia of the Social & Behavioral Sciences. Elsevier Science Ltd.
Hardie, B. G. S., and P. S. Fader. 2001. Applied probability models in marketing research: Introduction. Supplementary materials for the A/R/T Forum Tutorial.
McLaughlin, M. P. 1993. Regress +. Appendix A: A compendium of common probability distributions. Version 2.3.
Oller, R., G. Gómez, and M. L. Calle. 2003. Likelihood inferences with interval-censored data. Documents de Recerca. Universitat de Vic.
Romeu, J. L. 2004. Censored Data.
Models for Binary Data. WWS509 – Generalized Linear Models. Lecture Notes - Chapter 3.
2001. Poisson Models for Count Data. WWS509 – Generalized Linear Models. Lecture Notes - Chapter 4.
2001. Survival Models. WWS509 – Generalized Linear Models. Lecture Notes - Chapter 7.
of Likelihood Theory. WWS509 – Generalized Linear Models. Lecture Notes - Appendix A.
over-dispersed count data. WWS509 – Generalized Linear Models. Lecture Notes - Appendix C.
Sit, V., and M. Poulin-Costello. 1994. Catalogue of Curves for Curve Fitting. Handbook No. 4. Biometrics Information Handbook Series, W. Bergerud and V. Sit, editors. Ministry of Forests
Stark, P. B. measures of uncertainty in inverse problems. Workshop on Uncertainty in Inverse Problems. Institute for Mathematics and
Dennis, B. 1996. Discussion: Should ecologists become Bayesians? Ecological Applications 6(4):1095-1103.
Bayesian statistics involve substantial changes in the methods and philosophy of science. Before adopting Bayesian approaches, ecologists should consider carefully whether or not scientific understanding will be enhanced. Frequentist statistical methods, while imperfect, have made an unquestioned contribution to scientific progress and are a workhorse of day-to-day research. Bayesian statistics, by contrast, have a largely untested track record. The papers in this special section on Bayesian statistics exemplify the difficulties inherent in making convincing scientific arguments with Bayesian reasoning.
Dennis, B. 2004. Statistics and the scientific method in ecology. Chapter 11 in: M. L. Taper and S. R. Lele, editors. The Nature of Scientific Evidence: Statistical, Philosophical, and Empirical
Edwards, D. 1996. Comment: The first data analysis should be journalistic. Ecological Applications 6(4):1090-1094.
Ellison, A. M. 1996. An introduction to Bayesian inference for ecological research and environmental decision-making. Ecological Applications 6(4):1036-1046.
In our statistical practice, we ecologists work comfortably within the hypothetico-deductive epistemology of Popper and the frequentist statistical methodology of Fisher. Consequently, our null hypotheses do not often take into account pre-existing data and do not require parameterization, our experiments demand large sample sizes, and we rarely use results from one experiment to predict the outcomes of future experiments. Comparative statistical statements such as "we reject the null hypothesis at the 0.05 level," which reflect the likelihood of our data given our hypothesis, are of little use in communicating our results to nonspecialists or in describing the degree of certitude we have in our conclusions. In contrast, Bayesian statistical inference requires the explicit assignment of prior probabilities, based on existing information, to the outcomes of experiments. Such an assignment forces the parameterization of null and alternative hypotheses. The results of these experiments, regardless of sample size, then can be used to compute posterior probabilities of our hypotheses given the available data. Inferential conclusions in a Bayesian mode also are more meaningful in environmental policy discussions: I argue that a "Bayesian ecology" would (a) make better use of pre-existing data; (b) allow stronger conclusions to be drawn from large-scale experiments with few replicates; and (c) be more relevant to environmental decision-making.
Ellison, A. M. (in review). Bayesian inference in ecology: historical antecedents, current developments, and future prospects. Ecology Letters.
Bayesian inference is an important new analytical tool among the plethora of statistical methods used by ecologists. In a Bayesian analysis, any and all information available before a study is conducted can be summarized in a model or hypothesis: the prior probability distribution. Bayes' Theorem uses the prior probability distribution and the likelihood distribution obtained from the observed data to update the prior and generate a posterior probability distribution. Posterior probability distributions are an alternative to conventional P-values. Posterior probability distributions provide a direct and intuitively meaningful measure of the degree of confidence that can be placed on parameter estimates. Further, Bayesian information-theoretic methods have been proposed that may provide robust measures of the probability of competing alternative hypotheses. Ecologists are using Bayesian inference in studies that range from predicting single species population dynamics to understanding ecosystem processes. Ecologists do not appreciate as well the philosophical underpinnings of Bayesian inference, however. In particular, the Bayesian assumption that model parameters are random variables directly conflicts with the assumption of frequentist and likelihood methods that model parameters have fixed (true) values. This assumption must be addressed forthrightly before deciding whether or not to use Bayesian methods to analyze ecological data.
Ellison, A. M. 2004. Essay: Statistics and Science, objectivity and truth: Comments on Dennis.
Pigliucci, M. Science as a Bayesian algorithm.
Royall, R. 2004. The likelihood paradigm for statistical evidence. Pages 119-152 in: M. L. Taper and S. R. Lele, editors. The Nature of Scientific Evidence: Statistical, Philosophical, and Empirical Considerations.
Royall presents the basic philosophy behind likelihood methods. Contains commentaries and rejoinders.