Introduction
Who lives with whom and why? In one form or another this is a common
question of naturalists, farmers, natural resource managers, academics, and anyone who is
just curious about nature. This book describes statistical tools to help answer that
question.
Species come and go on their own but interact with each other and
their environments. Not only do they interact, but there is limited space to fill. If
space is occupied by one species, it is usually unavailable to another. So, if we take
abundance of species as our basic response variable in community ecology, then we must
work from the understanding that species responses are not independent and that a cogent
analysis of community data must consider this lack of independence.
We confront this interdependence among response variables by
studying their correlation structure. We also summarize how our sample units are related
to each other in terms of this correlation structure. This is one form of "data
reduction." Data reduction takes various forms, but it has two basic parts: (1)
summarizing a large number of observations into a few numbers and (2) expressing many
interrelated response variables in a more compact way.
Many people realize the need for multivariate data reduction after
collecting masses of community data. They become frustrated with analyzing the data one
species at a time. Although this is practical for very simple communities, it is
inefficient, awkward, and unsatisfying for even moderate-sized data sets, which may easily
contain 100 or more species.
We can approach data reduction by categorization (or
classification), a natural human approach to organizing complex systems. Or we can
approach it by summarizing continuous change in a large number of variables as a synthetic
continuous variable (ordination). The synthetic variable represents the combined variation
in a group of response variables. Data reduction by categorization or classification is
perhaps the most intuitive, natural approach. It is the first solution to which the human
mind will gravitate when faced with a complex problem, especially when we are trying to
elucidate relationships among objects, and those objects have many relevant
characteristics.
For example, consider a community data set consisting of 100 sample
units and the 80 species found in those sample units. This can be organized as a table
with 100 objects (rows) and 80 variables (columns). Faced with the problem of summarizing
the information in such a data set, our first reaction might be to construct some kind of
classification of the sample units. Such a classification boils down to assigning a
category to each of the sample units. In so doing, we have taken a data matrix with 80
variables and reduced it to a single variable with one value for each of the objects
(sample units).
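The reduction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abundances are random numbers rather than real data, the names `sorensen_distance` and `classify` are hypothetical, and the rule of assigning each sample unit to the nearest of three arbitrarily chosen "seed" units is a stand-in for a real classification method, not a procedure taken from the text.

```python
import random

def sorensen_distance(a, b):
    """Sorensen (Bray-Curtis) distance between two abundance vectors."""
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0  # two empty sample units are treated as identical
    return sum(abs(x - y) for x, y in zip(a, b)) / total

def classify(matrix, seeds):
    """Label each row with the index of its nearest seed row (hypothetical rule)."""
    labels = []
    for row in matrix:
        distances = [sorensen_distance(row, matrix[s]) for s in seeds]
        labels.append(distances.index(min(distances)))
    return labels

# Made-up data: 100 sample units (rows) by 80 species (columns).
random.seed(0)
n_units, n_species = 100, 80
matrix = [[random.randint(0, 10) for _ in range(n_species)]
          for _ in range(n_units)]

seeds = [0, 50, 99]  # arbitrary seed sample units, chosen for illustration
groups = classify(matrix, seeds)

print(len(matrix), "sample units x", len(matrix[0]), "species")
print(len(groups), "category labels, one per sample unit")
```

Whatever classification method is actually used, the shape of the result is the same: the 80 columns are replaced by a single categorical variable with one value per sample unit.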
The other fundamental method of data reduction is to construct a
small number of continuous variables representing a large number of the original
variables. This is possible and effective only if the original response variables covary.
It is not as intuitive as classification, because we must abandon the comfortable
typological model. But what we get is the capacity to represent continuous change as a
quantitative synthetic variable, rather than forcing continuous change into a set of
pigeonholes.
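As a rough sketch of this second kind of data reduction, the code below builds one synthetic variable from six covarying species. Several things here are assumptions made for illustration: the data are simulated along a hidden gradient, the function names `first_axis` and `pearson` are hypothetical, and the technique is a plain principal components axis found by power iteration, chosen only because it is short to write. It is not presented as the preferred ordination method for community data.

```python
import random

def first_axis(matrix, iterations=100):
    """Return (weights, scores) for the first principal axis of the columns."""
    n, p = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in matrix]
    # Covariance matrix among the response variables (species).
    cov = [[sum(r[j] * r[k] for r in centered) / (n - 1) for k in range(p)]
           for j in range(p)]
    # Power iteration converges to the dominant eigenvector of cov.
    v = [1.0] + [0.0] * (p - 1)
    for _ in range(iterations):
        w = [sum(cov[j][k] * v[k] for k in range(p)) for j in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # One synthetic score per sample unit.
    scores = [sum(r[j] * v[j] for j in range(p)) for r in centered]
    return v, scores

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Simulated data: 30 sample units, 6 species that covary along one gradient.
random.seed(1)
slopes = [5.0, 4.0, 3.0, -3.0, -4.0, -5.0]
gradient = [random.random() for _ in range(30)]
matrix = [[max(0.0, 6.0 + s * g + random.gauss(0.0, 0.3)) for s in slopes]
          for g in gradient]

weights, scores = first_axis(matrix)
print(len(matrix[0]), "species summarized by 1 synthetic variable")
print("abs. correlation with hidden gradient:",
      round(abs(pearson(scores, gradient)), 2))
```

Because the simulated species covary strongly, a single axis recovers the hidden gradient well. If the species varied independently, no single synthetic variable could represent them, which is the point made above about covariation.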
So data reduction is summarization, and summarization can result in
categories or quantitative variables. The need for data reduction is
not unique to community ecology. It arises in many disciplines, including sociology,
psychology, medicine, economics, market analysis, and meteorology. Given this broad need,
it is no surprise that many of the basic tools of data reduction (multivariate analysis)
have been widely written about and are available in all major statistical software
packages.
In community ecology, our response variables usually have distinct
and unwelcome properties compared with the variables expected by traditional multivariate
analyses. These are not just minor violations. These are fundamental problems with the
data that seriously weaken the effectiveness of traditional multivariate tools.
This book is about how species abundance as a response variable
differs from the ideal, how this creates problems, and how to deal effectively with those
problems. This book is also about how to relate species abundance to environmental
conditions, the various challenges to analysis, and ways to extract the most information
from a set of correlated predictors.
Definition of community
What is a "community" in ecology? The word has been used
in many different ways, and it is unlikely that it will ever be used consistently. Some use
"community" to mean an abstract group of organisms that recurs on the landscape. This
can be called the abstract community concept, and it usually carries with
it an implication of a level of integration among its parts that could be called
organismal or quasi-organismal. Others, including us, use the concrete community concept,
meaning simply the collection of organisms found at a specific place and time. The concrete
community is formalized by a sample unit which arbitrarily bounds and
compartmentalizes variation in species composition in space and time. The content of a
sample unit is the operational definition of a community.
The word "assemblage" has often been used in the sense of
a concrete community. Not only is this an awkward word for a simple concept, but the word
also carries unwanted connotations. It implies to some that species are independent and
noninteracting. In this book, we use the term "community" in the concrete sense,
without any conceptual or theoretical implications in itself.
Why study biological communities?
People have been interested in natural communities of organisms for
a long time. Prehistoric people (and many animals, perhaps) can be considered community
ecologists, since their ability to survive depended in part on their ability to recognize
habitats and to understand some of the environmental implications of species they
encountered. What differentiates community ecology as a scientific endeavor is that we
systematically collect data to answer the question "why" in the "who lives
with whom and why."
Another fundamental question of community ecology is "What
controls species diversity?" This springs from the more basic question, "What
species are here?" We keep backyard bird lists. We note which species of fish occur
in each place where we go fishing. We have mental inventories of our gardens. Inventorying
species is perhaps the most fundamental activity in community ecology. Few ecologists can
resist, however, going beyond that to try to understand which species associate with which
other species and why, how they respond to environmental changes, how they respond to
disturbance, and how they respond to our attempts to manipulate species.
It is not possible now, nor is it ever likely to be possible, to make
reliable, specific, long-term predictions of community dynamics for specific sites based
on general ecological theory. This is not to say we should not try. But we face the same
problems as long-term weather forecasters. Most of our predictive success will come from
short-term predictions applying local knowledge of species and environment to specific
sites and questions.
Purpose and structure of this book
The primary purpose of this book is to describe the most important
tools for data analysis in community ecology. Most of the tools described in this book can
be used either in the description of communities or the analysis of manipulative
experiments. The topics of community sampling and measuring diversity each deserve a book
in themselves. Rather than completely ignoring those topics, we briefly present some of
the most important issues relevant to community ecology. Explicitly spatial statistics as
applied to community ecology likewise deserve a whole book. We excluded this topic here,
except for a few tangential references.
Each analytical method in this book is described with a standard
format: Background, When to use it, How it works, What to report, Examples, and
Variations. The Background section briefly describes the development of
the technique, with emphasis on the development of its use in community ecology. It also
describes the general purpose of the method. When to use it describes
more explicitly the conditions and assumptions needed to apply the method. Knowing How
it works will also help most readers appreciate when to use a particular method.
Depending on the utility of the method to ecologists, the level of detail varies from an
overview to a full step-by-step description of the method. What to report
lists the methodological options and key portions of the numerical results that should be
given to a reader. It does not include items that should be reported from any analysis,
such as data transformations (if any) and detection and handling of outliers. Examples
provide further guidance on how to use the methods and what to report. Variations
are available for most techniques. Describing all of them would result in a much more
expensive book. Instead, we emphasize the most useful and basic techniques. The references
in each section provide additional information about the variants.