Analysis of Ecological Communities

Analysis of Ecological Communities is a book by Bruce McCune, James B. Grace, and Dean L. Urban on methods for analyzing multivariate data in community ecology, published by MjM Software Design in 2002. Bruce McCune is a professor in the Department of Botany & Plant Pathology at Oregon State University, co-author of the PC-ORD software, and author of several books on lichens.

Analysis of Ecological Communities offers a rationale and guidance for selecting appropriate, effective analytical methods in community ecology. The book is suitable as a textbook and reference book on methods for multivariate analysis of ecological communities and their environments. The book covers distance measures, data transformation, outlier analysis, ordination, cluster analysis, PCA, RA, CA, DCA, NMS, CCA, Bray-Curtis, MRPP, Mantel test, discriminant analysis, TWINSPAN, classification and regression trees, structural equation modeling, and more. It also includes brief treatments of community sampling and diversity measures. The 304-page book is richly illustrated. It provides many examples from the literature and demonstrations of basic principles with simulated and real data sets.

Cover

The book cover represents the dust bunny distribution found in ecological community data, shown at three levels of abstraction.

Background: a dust bunny is the accumulation of fluff, lint, and dirt particles in the corner of a room.

Middle: sample units in a 3-D species space, the three species forming a series of unimodal distributions along a single environmental gradient. Each axis represents abundance of one of the three species; each ball represents a sample unit. The vertical axis and the axis coming forward represent the two species peaking on the extremes of the gradient. The species peaking in the middle of the gradient is represented by the horizontal axis.

Foreground: the environmental gradient forms a strongly nonlinear shape in species space. The species represented by the vertical axis dominates one end of the environmental gradient, the species shown by the horizontal axis dominates the middle, and the species represented by the axis coming forward dominates the other end of the environmental gradient.

Successful representation of the environmental gradient requires a technique that can recover the underlying 1-D gradient from its contorted path through species space.
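
The geometry described above can be sketched numerically. The following is a minimal illustration, not code from the book: it assumes Gaussian-shaped species responses, and the optima, tolerance, and peak abundance are hypothetical values chosen only to mimic three species peaking at the two ends and the middle of a single gradient. It compares the length of the path that the sample units trace through 3-D species space with the straight-line distance between the two ends of the gradient.

```python
import math

# Hypothetical parameters: three species with optima at the low end,
# the middle, and the high end of a gradient scaled from 0 to 1.
OPTIMA = (0.0, 0.5, 1.0)


def gaussian_response(x, optimum, tolerance=0.15, peak=100.0):
    """Unimodal (Gaussian-shaped) abundance of one species at gradient position x."""
    return peak * math.exp(-((x - optimum) ** 2) / (2 * tolerance ** 2))


def sample_unit(x):
    """Coordinates of one sample unit in 3-D species space."""
    return tuple(gaussian_response(x, optimum) for optimum in OPTIMA)


def euclidean(a, b):
    """Straight-line distance between two points in species space."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


# Eleven sample units spaced evenly along the 1-D environmental gradient.
gradient = [i / 10 for i in range(11)]
units = [sample_unit(x) for x in gradient]

# Length of the route through species space, unit to unit along the gradient.
path_length = sum(euclidean(units[i], units[i + 1]) for i in range(len(units) - 1))

# Direct distance between the two ends of the gradient in species space.
chord = euclidean(units[0], units[-1])

print(f"path through species space: {path_length:.1f}")
print(f"straight line between gradient ends: {chord:.1f}")
```

Under these assumptions the path is much longer than the straight line between its endpoints, which is the sense in which a straight 1-D gradient becomes a bent curve in species space.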


Contents

PREFACE iv
PART 1. OVERVIEW 1
1. INTRODUCTION 1
2. OVERVIEW OF COMMUNITY MATRICES 3
3. COMMUNITY SAMPLING AND MEASUREMENTS 13
4. SPECIES DIVERSITY 25
5. SPECIES ON ENVIRONMENTAL GRADIENTS 35
6. DISTANCE MEASURES 45
PART 2. DATA ADJUSTMENTS 58
7. DATA SCREENING 58
8. DOCUMENTING THE FLOW OF ANALYSES 62
9. DATA TRANSFORMATIONS 67
PART 3. DEFINING GROUPS WITH MULTIVARIATE DATA 80
10. OVERVIEW OF METHODS FOR FINDING GROUPS 80
11. HIERARCHICAL CLUSTERING 86
12. TWO-WAY INDICATOR SPECIES ANALYSIS 97
PART 4. ORDINATION 102
13. INTRODUCTION TO ORDINATION 102
14. PRINCIPAL COMPONENTS ANALYSIS 114
15. ROTATING ORDINATION AXES 122
16. NONMETRIC MULTIDIMENSIONAL SCALING 125
17. BRAY-CURTIS (POLAR) ORDINATION 143
18. WEIGHTED AVERAGING 149
19. CORRESPONDENCE ANALYSIS 152
20. DETRENDED CORRESPONDENCE ANALYSIS 159
21. CANONICAL CORRESPONDENCE ANALYSIS 164
22. RELIABILITY OF ORDINATION RESULTS 178
PART 5. COMPARING GROUPS 182
23. MULTIVARIATE EXPERIMENTS 182
24. MRPP (MULTI-RESPONSE PERMUTATION PROCEDURES) 188
25. INDICATOR SPECIES ANALYSIS 198
26. DISCRIMINANT ANALYSIS 205
27. MANTEL TEST 211
28. NESTED DESIGNS 218
PART 6. STRUCTURAL MODELS 222
29. CLASSIFICATION AND REGRESSION TREES 222
30. STRUCTURAL EQUATION MODELING 233
APPENDIX 1. ELEMENTARY MATRIX ALGEBRA 257
REFERENCES 260
INDEX 284

 


Introduction

    Who lives with whom and why? In one form or another this is a common question of naturalists, farmers, natural resource managers, academics, and anyone who is just curious about nature. This book describes statistical tools to help answer that question.

    Species come and go on their own but interact with each other and their environments. Not only do they interact, but there is limited space to fill. If space is occupied by one species, it is usually unavailable to another. So, if we take abundance of species as our basic response variable in community ecology, then we must work from the understanding that species responses are not independent and that a cogent analysis of community data must consider this lack of independence.

    We confront this interdependence among response variables by studying their correlation structure. We also summarize how our sample units are related to each other in terms of this correlation structure. This is one form of "data reduction." Data reduction takes various forms, but it has two basic parts: (1) summarizing a large number of observations into a few numbers and (2) expressing many interrelated response variables in a more compact way.

    Many people realize the need for multivariate data reduction after collecting masses of community data. They become frustrated with analyzing the data one species at a time. Although this is practical for very simple communities, it is inefficient, awkward, and unsatisfying for even moderate-sized data sets, which may easily contain 100 or more species.

    We can approach data reduction by categorization (or classification), a natural human approach to organizing complex systems. Or we can approach it by summarizing continuous change in a large number of variables as a synthetic continuous variable (ordination). The synthetic variable represents the combined variation in a group of response variables. Data reduction by categorization or classification is perhaps the most intuitive, natural approach. It is the first solution to which the human mind will gravitate when faced with a complex problem, especially when we are trying to elucidate relationships among objects, and those objects have many relevant characteristics.

    For example, consider a community data set consisting of 100 sample units and the 80 species found in those sample units. This can be organized as a table with 100 objects (rows) and 80 variables (columns). Faced with the problem of summarizing the information in such a data set, our first reaction might be to construct some kind of classification of the sample units. Such a classification boils down to assigning a category to each of the sample units. In so doing, we have taken a data matrix with 80 variables and reduced it to a single variable with one value for each of the objects (sample units).
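
As a rough sketch of this kind of reduction, the code below builds a made-up 100 x 80 abundance matrix and assigns each sample unit to a category. The data are random, and the classification rule (label each sample unit by its most abundant species) is a deliberately simple stand-in chosen for illustration; it is not one of the clustering methods covered in the book.

```python
import random

random.seed(0)  # fixed seed so the made-up data are reproducible

N_UNITS, N_SPECIES = 100, 80

# Hypothetical community matrix: rows are sample units, columns are species.
# Most cells are zero, as is typical of community data.
matrix = [
    [random.randint(0, 20) if random.random() < 0.2 else 0 for _ in range(N_SPECIES)]
    for _ in range(N_UNITS)
]


def dominant_species(row):
    """Index of the most abundant species in one sample unit (ties go to the first)."""
    return max(range(len(row)), key=lambda j: row[j])


# The classification: one categorical value per sample unit.
categories = [dominant_species(row) for row in matrix]

print(f"input: {len(matrix)} sample units x {len(matrix[0])} species")
print(f"output: {len(categories)} category labels, {len(set(categories))} distinct")
```

Whatever rule is used, the shape of the result is the same: 80 columns collapse into a single categorical variable with one value per sample unit.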

    The other fundamental method of data reduction is to construct a small number of continuous variables representing a large number of the original variables. This is possible and effective only if the original response variables covary. It is not as intuitive as classification, because we must abandon the comfortable typological model. But what we get is the capacity to represent continuous change as a quantitative synthetic variable, rather than forcing continuous change into a set of pigeonholes.
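
A minimal sketch of this second kind of reduction follows. It uses the first principal component, found by power iteration on the covariance matrix, as the synthetic variable. The three "species" are hypothetical and respond linearly to a hidden gradient, which is the favorable case in which the variables covary strongly; the small perturbations are fixed numbers rather than real measurement error.

```python
import math

# Hypothetical data: 10 sample units, 3 species that covary because they
# all respond (linearly) to one hidden gradient, plus small perturbations.
gradient = [i / 9 for i in range(10)]
wiggle = [0.3, -0.2, 0.1, -0.3, 0.2, -0.1, 0.3, -0.2, 0.1, -0.2]
rows = [(10 * g + w, 5 * g - w, 8 - 8 * g + 0.5 * w) for g, w in zip(gradient, wiggle)]

n, p = len(rows), len(rows[0])

# Center each species (column) on its mean.
means = [sum(r[j] for r in rows) / n for j in range(p)]
centered = [[r[j] - means[j] for j in range(p)] for r in rows]

# Covariance matrix among species.
cov = [
    [sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
    for a in range(p)
]


def first_axis(matrix, iterations=200):
    """Dominant eigenvector of a symmetric matrix by power iteration."""
    v = [1.0 / math.sqrt(len(matrix))] * len(matrix)
    for _ in range(iterations):
        w = [sum(matrix[a][b] * v[b] for b in range(len(v))) for a in range(len(v))]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


axis = first_axis(cov)

# One synthetic score per sample unit replaces the three species columns.
scores = [sum(c * a for c, a in zip(row, axis)) for row in centered]

# Share of total variance carried by the synthetic variable.
score_variance = sum(s * s for s in scores) / (n - 1)
total_variance = sum(cov[j][j] for j in range(p))
share = score_variance / total_variance


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


recovery = abs(pearson(scores, gradient))

print(f"share of variance on first axis: {share:.3f}")
print(f"|correlation| of scores with hidden gradient: {recovery:.3f}")
```

The sign of the axis is arbitrary, so the correlation is reported as an absolute value. If the species did not covary, no single axis would carry most of the variance and the reduction would lose much of the information.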

    So data reduction is summarization, and summarization can result in categories or quantitative variables. It is obvious that the need for data reduction is not unique to community ecology. It shows up in many disciplines including sociology, psychology, medicine, economics, market analysis, meteorology, etc. Given this broad need, it is no surprise that many of the basic tools of data reduction - multivariate analysis - have been widely written about and are available in all major statistical software packages.

    In community ecology, our response variables usually have distinct and unwelcome properties compared with the variables expected by traditional multivariate analyses. These are not just minor violations. These are fundamental problems with the data that seriously weaken the effectiveness of traditional multivariate tools.

    This book is about how species abundance as a response variable differs from the ideal, how this creates problems, and how to deal effectively with those problems. This book is also about how to relate species abundance to environmental conditions, the various challenges to analysis, and ways to extract the most information from a set of correlated predictors.

Definition of community

    What is a "community" in ecology? The word has been used many different ways and it is unlikely that it will ever be used consistently. Some use "community" as an abstract group of organisms that recurs on the landscape. This can be called the abstract community concept, and it usually carries with it an implication of a level of integration among its parts that could be called organismal or quasi-organismal. Others, including us, use the concrete community concept, meaning simply the collection of organisms found at a specific place and time. The concrete community is formalized by a sample unit which arbitrarily bounds and compartmentalizes variation in species composition in space and time. The content of a sample unit is the operational definition of a community.

    The word "assemblage" has often been used in the sense of a concrete community. Not only is this an awkward word for a simple concept, but the word also carries unwanted connotations. It implies to some that species are independent and noninteracting. In this book, we use the term "community" in the concrete sense, without any conceptual or theoretical implications in itself.

Why study biological communities?

    People have been interested in natural communities of organisms for a long time. Prehistoric people (and many animals, perhaps) can be considered community ecologists, since their ability to survive depended in part on their ability to recognize habitats and to understand some of the environmental implications of species they encountered. What differentiates community ecology as a scientific endeavor is that we systematically collect data to answer the question "why" in the "who lives with whom and why."

    Another fundamental question of community ecology is "What controls species diversity?" This springs from the more basic question, "What species are here?" We keep backyard bird lists. We note which species of fish occur in each place where we go fishing. We have mental inventories of our gardens. Inventorying species is perhaps the most fundamental activity in community ecology. Few ecologists can resist, however, going beyond that to try to understand which species associate with which other species and why, how they respond to environmental changes, how they respond to disturbance, and how they respond to our attempts to manipulate species.

    It is not possible now, nor is it ever likely to be possible, to make reliable, specific, long-term predictions of community dynamics for specific sites based on general ecological theory. This is not to say we should not try. But we face the same problems as long-term weather forecasters. Most of our predictive success will come from short-term predictions applying local knowledge of species and environment to specific sites and questions.

Purpose and structure of this book

    The primary purpose of this book is to describe the most important tools for data analysis in community ecology. Most of the tools described in this book can be used either in the description of communities or the analysis of manipulative experiments. The topics of community sampling and measuring diversity each deserve a book in themselves. Rather than completely ignoring those topics, we briefly present some of the most important issues relevant to community ecology. Explicitly spatial statistics as applied to community ecology likewise deserve a whole book. We excluded this topic here, except for a few tangential references.

    Each analytical method in this book is described with a standard format: Background, When to use it, How it works, What to report, Examples, and Variations. The Background section briefly describes the development of the technique, with emphasis on the development of its use in community ecology. It also describes the general purpose of the method. When to use it describes more explicitly the conditions and assumptions needed to apply the method. Knowing How it works will also help most readers appreciate when to use a particular method. Depending on the utility of the method to ecologists, the level of detail varies from an overview to a full step-by-step description of the method. What to report lists the methodological options and key portions of the numerical results that should be given to a reader. It does not include items that should be reported from any analysis, such as data transformations (if any) and detection and handling of outliers. Examples provide further guidance on how to use the methods and what to report. Variations are available for most techniques. Describing all of them would result in a much more expensive book. Instead, we emphasize the most useful and basic techniques. The references in each section provide additional information about the variants.