PC-ORD Version 6 Review
"PC-ORD is recommended strongly for many undergraduate and
postgraduate students and particularly for those attempting to use multivariate analysis
of vegetation data for the first time. This spreadsheet-based software (Excel(R) or
Lotus 1-2-3(R)), now in its 6th version (2010), includes methods for virtually all the
methods of multivariate analysis covered in this text... Two particularly important
features are the ability to produce publication-quality graphics and the availability of a
3-level autopilot mode for non-metric multidimensional scaling (NMS). The latter is a
particularly strong asset. The program can handle very large data sets."
"Given its range of applications and excellent supporting materials it is very
reasonably priced and there are generous discounts for student users. Purchasers can also
obtain the HyperNiche multiplicative habitat modelling package for non-parametric
regression... at a significant discount."
"Most important of all, however, is the availability of a step-by-step introductory
guide for students (Peck, 2010) and a comprehensive manual (McCune and Grace, 2002),
linked to an online help system within the software itself."
Fast, Easy, and Publication Quality Ecological
Analyses with PC-ORD
Abstract
The recent version 6 release of the software PC-ORD Multivariate Analysis of Ecological
Data (MjM Software, www.pcord.com) earns a solid A grade for ease of use,
comprehensiveness of analysis tools, and excellent graphing capabilities. New features in
version 6 include the addition of several analysis tools (PCoA, RDA, SumF), the
enhancement of others (partial Mantel test, orthogonal rotation in NMS), additional
graphing features (boxplots, convex hulls) and new options that finally make importing
data from Excel easy.
Description
The combined effort of Dr. Bruce McCune (Oregon State University) and MjM Software, the
PC-ORD software is an integrated compilation of data management, exploration, graphing,
and analysis tools applicable to the kinds of data collected by ecologists. Data are
stored in a spreadsheet format that can accommodate up to 32000 rows and columns (23170 if
a distance matrix will be calculated) and can be imported directly from an Excel
spreadsheet or from text files of various formats. Options are available to assist users
with structuring their data matrices (e.g., appending matrices, deleting rows and
columns), summarizing data (e.g., sums, diversity indices, species lists), graphically
exploring data (e.g., boxplots, scatterplots, distribution curves), and modifying data
(e.g., common transformations and relativizations). Analysis tools, with both reasonable
default settings and the ability to tailor input parameters as needed, are included to
address both common and complex analysis questions:
- How does species composition vary along known gradients?
  o polar ordination (Bray-Curtis ordination)
- How can sample units be ordered based on the relative abundance of indicator species?
  o weighted averaging ordination
- How does species composition or habitat condition vary among samples taken along unknown gradients?
  o principal components analysis and nonmetric multidimensional scaling, as well as reciprocal averaging, detrended correspondence analysis, and principal coordinates analysis
- How can new sample units be ordered to fit within the space of an existing ordination model?
  o predictive nonmetric multidimensional scaling (NMS scores)
- How does species composition change when specific explanatory variables are manipulated?
  o canonical correspondence analysis, redundancy analysis
- Does species composition or habitat condition vary among locations or change following a disturbance or as a result of applying a treatment?
  o multiresponse permutation procedure, distance-based MANOVA, SumF
- Which species behave similarly across a dataset?
  o agglomerative clustering, two-way clustering, TWINSPAN
- Which sample units can be classified into the same groups based on species or habitat data?
  o agglomerative clustering, two-way clustering
- Which species are most abundant and frequent in treatment or habitat groups?
  o indicator species analysis
- Do two datasets show similarity of compositional structure?
  o Mantel test
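To make the last question in the list concrete, here is a minimal sketch of a permutation-based Mantel test in plain Python. It is an illustration only, not PC-ORD's implementation: the function names, the choice of Pearson correlation as the test statistic, and the toy distance matrices are all assumptions made for the example.

```python
import random
from math import sqrt

def upper_triangle(matrix, order):
    """Flatten the above-diagonal entries of a square matrix, reading
    rows and columns in the given order."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def mantel_test(d1, d2, permutations=999, seed=0):
    """Return (r, p): the correlation between two distance matrices and a
    one-sided p-value from jointly permuting the rows and columns of d2."""
    rng = random.Random(seed)
    identity = list(range(len(d1)))
    x = upper_triangle(d1, identity)
    observed = pearson(x, upper_triangle(d2, identity))
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)
        if pearson(x, upper_triangle(d2, order)) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)

# Hypothetical data: four sample units on a line, and the same distances doubled.
positions = [0.0, 1.0, 2.0, 4.0]
dist_a = [[abs(p - q) for q in positions] for p in positions]
dist_b = [[2.0 * v for v in row] for row in dist_a]
r, p = mantel_test(dist_a, dist_b)
```

Rows and columns are permuted together so that each shuffled matrix is still a valid distance matrix; shuffling individual cells would break that structure.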
Review
Although the menu-driven user interface is easy enough to navigate and the help menu is
extensive, the first hurdle is simply getting the data into PC-ORD. Because the software
operates on *.wk1 formatted spreadsheet files, which Microsoft Excel no longer supports,
most users will not be able to simply open their data files but must instead
import them from an Excel file or from a text file that has been exported from
some other software (e.g., a database). Fortunately, version 6 now includes options to do
this relatively painlessly. However, once the data are in PC-ORD, which only saves files
in the *.wk1 format, users must export the data if they wish to make complex edits using
Excel (such as using formulas). Most users will require a few back-and-forths before they
get their data just the way they want them. From there on, use of the drop-down menus is
straightforward and strategically placed help buttons can answer many
questions. Users who are not sure where to start may take advantage of the advisor
wizard, which asks a series of questions to lead users through a dichotomous key to
determine which data manipulations or analyses are most appropriate. Although the
questions posed by the wizard may be difficult for beginning users, they force the user to
think through important decisions in their analysis pathway, and help is provided in the
form of page numbers for relevant reading in the 2002 Analysis of
Ecological Communities book by Bruce McCune and James Grace.
Once users have their data in the software and have become accustomed to the interface,
they will be pleasantly surprised at the range of tools available in PC-ORD that are tailored to
ecological datasets. While basic data management options are annoyingly lacking (e.g.,
copy and paste, aggregating across subsets of data), PC-ORD includes tools useful for
ecologists that are not commonly found in statistical software packages, such as the
ability to construct species area curves, plot smoothed variable distributions, create
species lists, or conduct analyses such as polar ordination (which they term
Bray-Curtis ordination), indicator species analysis, and SumF. In addition,
all of the most common ordination, classification, and group-testing tools are available
and many can be run using default settings or easily tailored to fit dataset constraints
and analysis objectives. For example, simple mouse clicks in the setup window are all that
is needed to run a PCA with a variance/covariance cross-products matrix, include biplot
scores for species, and add a randomization test. Similarly, NMS can be run using default
settings (in Autopilot mode) or the user can define input (e.g., number of
axes) and output (e.g., orthogonal rotation) settings. The information provided following
most procedures in the result file is so extensive that the beginning user may
be a bit overwhelmed, but more advanced users will appreciate having access to details
that can be cumbersome to program in other software.
Almost all users will find the graphics interface a welcome contrast to those available in
the most common statistical software packages. After running procedures that produce
output that can be graphed (e.g., ordinations, clustering dendrograms, species area
curves), users enter a graphics interface that is like a program within a program, in that
it is not possible to access the data files or main program menus while graphing output.
Once there, however, drop-down menus or toolbars provide many options for viewing output
optimally and extensive user preference options can be used to tailor graphics precisely
for the desired medium. Nice features include the ability to define and save preference
sets that control features such as plot type, fonts, colors and line and symbol types:
users can apply one set (e.g., in black and white) for manuscripts and another (e.g., in
color) for presentations.
Upgrading from version 5 to version 6 is probably worth it. A notable improvement is the
ability to now import/export spreadsheet data (as *.xls Excel files) without including the
special formatting information that PC-ORD requires at the top of the datafile. In
addition, several new analysis tools have been added, including the metric
multidimensional scaling procedure principal coordinates analysis (PCoA), the linear
equivalent to CCA in redundancy analysis (RDA), and a permutation test based on aggregated
F-statistics called SumF. Version 6 also includes additional graphing options, such as
creating boxplots and drawing convex hulls, which connect the dots around
groups in ordination space (Figure 1).
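For readers unfamiliar with convex hulls, the sketch below shows one standard way to compute the outline of a group of points in a two-dimensional ordination (Andrew's monotone chain algorithm). It is not taken from PC-ORD; the function names and the example scores are hypothetical.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b.
    Positive means a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Return the hull vertices in counter-clockwise order, starting from
    the lowest-sorted point. Interior and collinear points are dropped."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]

# Hypothetical axis 1 / axis 2 scores for two groups of sample units.
scores_by_group = {
    "period_1": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],
    "period_2": [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0), (1.0, 1.0)],
}
hulls = {group: convex_hull(pts) for group, pts in scores_by_group.items()}
```

With only three replicates per group, as in Figure 1, each hull is simply the triangle joining the three sample units.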
Many of the tools within PC-ORD are available in other software. For instance, other
commercial products have comparable versions of PCA (CANOCO, PRIMER, SAS, S+, etc.) and
clustering (PRIMER, SAS, SPSS, S+, etc.), and these and other tools are also available in
freeware (e.g., the vegan package in R). While a few users may prefer options only
available in other software packages (e.g., including covariates in CCA within CANOCO,
applying distance-based MANOVA to some unbalanced designs in PRIMER, or exploring species
area relationships in depth in EstimateS) the vast majority of users will find the tools
available in PC-ORD to be equivalent to, or better than, versions available elsewhere. For
instance, PC-ORD provides users the greatest flexibility available for running NMS and
provides the most detailed output, necessary for the proper interpretation of this complex
tool. The single greatest argument for PC-ORD, however, is simply that most of the
multivariate analysis tools needed to explore species datasets are present in the same
software package, eliminating the time-consuming need to learn multiple programs, convert
datasets to different formats for software entry, and search around for the best tool for
making publication-quality graphics.
PC-ORD is competitively priced for student and single-user licenses; site
licenses, however, can be quite expensive as the fee increases linearly with the number of
potential simultaneous users. Although available to order online, the software is not
downloadable, which netbook users (and procrastinators) may find inconvenient. Mac-only
users are also out of luck, as PC-ORD only runs on the Microsoft Windows platform (Windows
98 and later). The software does not come with a manual, but the extensive help menu
contains a wealth of descriptions, explanations, equations, and citations. In addition, a
new companion book is now available (Peck 2010) that
guides beginning users through the analysis process using the tools available in version
6.
References
McCune, B. & J. Grace. 2002. Analysis of Ecological Communities. MjM Software Design,
Gleneden Beach, Oregon. 300 p.
Peck, J.E. 2010. Multivariate Analysis for Community Ecologists: Step-by-Step using
PC-ORD. MjM Software Design, Gleneden Beach, Oregon. 162 p.
Figure 1. PC-ORD can be used to draw convex hulls connecting sample units that belong to
the same group. Here the four groups with three reps each represent four different periods
of time.
PC-ORD Version 5 Review
Journal of Vegetation Science 17: 843-844, 2006, reproduced with
permission
by Grandin, Ulf
Department of Environmental Assessment
Swedish University of Agricultural Sciences
Box 7050
SE 75007 Uppsala
PC-ORD version 5: A user-friendly toolbox for ecologists
Abstract
Recently, version 5 of PC-ORD, one of the major commercial software packages for
multivariate ecological community data analyses, was released. The new version offers a
whole range of techniques and methods for analyses of ecological data. It includes modules
for different types of ordination and classification, as well as other exploratory
techniques such as species-area curve analysis and indicator species analysis. Data are
stored in spreadsheets and can be easily manipulated in various ways. In essence, version
5 of PC-ORD offers the user a full toolbox for exploration and analysis of ecological
data, packed in a user-friendly environment.
Description
Recently, a new version of PC-ORD, a software package for multivariate analysis of
ecological data, has been released. This package, developed by Bruce McCune and others
(McCune & Grace 2002), is one of the major commercial software packages for
multivariate ecological community data analyses. The new version 5 includes both
enhancements of existing analyses and new features. Among the new features is an
extended graph module with possibilities for 3D ordination plots, two-way cluster
dendrograms, dominance-diversity curves and frequency-abundance plots, and frequency
distributions. The main improvements to the previous graph module are better options for
editing graphs, and increased export options. The previous tray of analyses is extended
with permutation-based MANOVA with one-way, factorial, nested, and blocked designs,
two-way cluster analysis, smoothed univariate frequency distributions, and a function that
displays the most important summary features of a data set. The previous analyses are
enhanced with randomization tests for PCA, cluster analysis directly from a distance
matrix, writing of a distance matrix to spreadsheet or text file, and an option to break
down row and column summaries by a variable in the second matrix. To help users to select
the appropriate analysis, an advisor wizard, based on a decision tree, is added. Data
management and import/export have been improved. Version 5 allows, for example, simultaneous
adjustment of main and second matrices, and filtering rows by a criterion variable.
Review
Once the new user has become acquainted with the somewhat antiquated way of entering data,
PC-ORD version 5 offers a wide variety of tools for exploring data and testing hypotheses
in community ecology. The software is a collection of classical as well as more novel
statistics, used in numerical ecology. In addition to a variety of ordination and
classification techniques, the program also includes modules for testing group identity,
constructing species-area curves, Mantel tests and non-parametric MANOVA.
The interface is intuitive and easy to understand. It is easy to keep track of different
datasets and variables through complex analyses in several steps. There are a number of
possibilities for data transformation, manipulation and permutation. In all analyses,
results from intermediate calculations as well as final results are written to a results
window that can be saved. Additionally, ordination scores are written to a separate file,
which facilitates export.
For ecologists, multivariate statistical methods may be divided into hypothesis generating
(i.e. exploratory), and hypothesis testing methods (Økland 1996). Version 5 of PC-ORD
offers a wide variety of both types. The exploratory, or indirect, type of methods
includes traditional analyses such as principal components analysis, correspondence
analysis, and detrended correspondence analysis. In addition, there is an array of methods
for summarising and inspecting data, including e.g. calculation of diversity indices and
outlier analysis. Interesting and useful additional exploratory techniques include
species-area curves analysis and indicator species analysis (Dufrêne & Legendre
1997).
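As an illustration of what indicator species analysis computes, the sketch below follows the indicator value of Dufrêne & Legendre (1997) as it is commonly described: for each group, the species' share of mean abundance across groups multiplied by its frequency of occurrence within the group, expressed as a percentage. The function name and data are hypothetical, and the randomization test that normally accompanies the index is omitted.

```python
def indicator_values(abundances, groups):
    """Indicator value (0-100) of ONE species for each group.

    abundances: abundance of the species in each sample unit
    groups:     group label of each sample unit (same length)
    """
    by_group = {}
    for value, label in zip(abundances, groups):
        by_group.setdefault(label, []).append(value)

    mean_abundance = {g: sum(v) / len(v) for g, v in by_group.items()}
    total_of_means = sum(mean_abundance.values())

    result = {}
    for g, values in by_group.items():
        # Share of the species' mean abundance that falls in this group.
        relative_abundance = mean_abundance[g] / total_of_means if total_of_means > 0 else 0.0
        # Proportion of this group's sample units where the species occurs.
        relative_frequency = sum(1 for x in values if x > 0) / len(values)
        result[g] = 100.0 * relative_abundance * relative_frequency
    return result

# Hypothetical cover values for one species in three burned and three control plots.
cover = [4, 6, 5, 0, 0, 1]
plots = ["burned", "burned", "burned", "control", "control", "control"]
indval = indicator_values(cover, plots)
```

In this toy example the species is both concentrated in and always present in the burned plots, so its indicator value there is high (93.75) and its value for the control plots is close to zero.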
The hypothesis testing, or constrained, methods include multidimensional scaling as
well as χ²-based methods such as canonical correspondence analysis.
There are options for permutation tests of group identity but there is no option for
testing the significance of individual explanatory variables prior to a constrained
ordination. However, the graph module offers an elegant way of inspecting the contribution
of the individual explanatory variables. In ordination, PC-ORD can plot the relationship
between an ordination axis and individual species as well as explanatory variables.
For classification, PC-ORD offers a wide variety of tools. In the modules for both one-
and two-way hierarchical classification, a user may choose among many combinations of
distance measures and agglomeration techniques. The classical method TWINSPAN (Hill 1979)
is also included.
A new feature in the current version is a dichotomous decision tree for helping users to
select an appropriate method. The intentions behind this tree are obvious, but to be able
to answer the sometimes quite complex questions, the user has to be very familiar with
multivariate methods. My feeling is that a user who has the experience to be able to
answer the questions probably does not need the decision tree. Anyhow, for a user that has
just started using these techniques, the tree may be of great help, given that the user
knows the nomenclature. A more advanced user may use the tree to explore the capabilities
of the program.
Another interesting feature is the possibility of including your own programs as add-in
tools. In the standard installation, a program for calculating degree of nestedness
(sensu Patterson & Atmar 1986) is included. This option may not be the most important
feature for a new or intermediate user, but is a means for the more advanced user to
personalise the program.
The graph module is easy to use and allows the user to view ordination results in both two
and three dimensions. An interesting feature is the possibility of drawing successional
vectors in ordination diagrams. Results of classifications are illustrated with
dendrograms in one or two dimensions, with scales showing distance, and remaining
information along a hierarchical tree. Produced graphs are of publication quality and can
be saved in a number of formats. There are numerous options for personalizing a graph,
including varying symbol sizes, labels, vectors, grids, and construction of joint plots.
Documentation of the program is only provided as comprehensive help files obtained from
within the program. The content of the help files is sufficient, with both examples and
well as theoretical background for the different techniques included in the program.
However, many users would probably prefer the documentation as a printed hardcopy.
PC-ORD can only be run under the operating system Windows, version Win98 or higher. The
program can accept data matrices with more than 500 million elements, or a maximum of
32000 columns or rows. This is probably larger than most ecological datasets. The price
for a single user licence is competitive compared to other similar commercial software. A
site licence is on the other hand relatively expensive as the cost increases with the
number of users. The website (www.pcord.com) offers online ordering, but the program
cannot be downloaded.
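The two size limits quoted above interact, which a few lines of arithmetic make clear. The sketch below treats 500 million elements as a conservative budget (the review only says "more than 500 million") and 32000 as the cap on either dimension; the constant and function names are made up for the example. The final lines note that the 23170-row limit quoted for distance matrices in the version 6 review is the integer square root of 2^29, which is an observation of this sketch, not something either review states.

```python
from math import isqrt

MAX_ROWS_OR_COLUMNS = 32_000     # per-dimension limit quoted in the reviews
ELEMENT_BUDGET = 500_000_000     # conservative: the review says "more than" this

def max_other_dimension(size, element_budget=ELEMENT_BUDGET):
    """Largest count for the other dimension given one dimension's size."""
    return min(MAX_ROWS_OR_COLUMNS, element_budget // size)

def fits(rows, columns, element_budget=ELEMENT_BUDGET):
    """True if a rows x columns matrix respects both limits."""
    return (rows <= MAX_ROWS_OR_COLUMNS
            and columns <= MAX_ROWS_OR_COLUMNS
            and rows * columns <= element_budget)

# A matrix cannot be 32000 x 32000 under this budget; with 32000 rows
# there is room for 15625 columns.
columns_at_max_rows = max_other_dimension(32_000)

# Largest square matrix under the conservative budget.
largest_square = isqrt(ELEMENT_BUDGET)

# Observation only: 23170 is the integer square root of 2**29.
distance_matrix_limit = isqrt(2 ** 29)
```

Few ecological datasets approach these sizes, which is consistent with the reviewer's remark that the limits are larger than most users will need.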
Many of the techniques and modules included in PC-ORD can also be found on the Internet as
self-standing freeware. VEGAN (Oksanen 2006) and Ginkgo (Font et al. 2006; see Bouxin
2005) are examples of free software for multivariate techniques, written for ecologists.
The PC-ORD module for species-area relationships is a light version of the freeware
EstimateS (Colwell 1997). TWINSPAN and IndVal, both of which are included in PC-ORD, are also
available for free. However, in PC-ORD most necessary techniques for exploring and
analysing ecological data are collected in one common frame, with no need for repeated and
time-consuming data preparation for several programs.
In summary, PC-ORD offers a wide range of tools for analysing ecological data in a
user-friendly environment.
References
Bouxin, G. 2005. Review of Ginkgo, a multivariate analysis package. J. Veg. Sci. 16:
355-359.
Colwell, R.K. 1997. EstimateS: Statistical estimation of species richness and shared
species from samples. Version 5. User's guide and application. Published at:
http://viceroy.eeb.uconn.edu/estimates.
Dufrêne, M. & Legendre, P. 1997. Species assemblages and indicator species: the need
for a flexible asymmetrical approach. Ecol. Monogr. 67: 345-366.
Font, X., de Cáceres, M. & García, M. 2006. Ginkgo, a multivariate analysis tool.
See http://biodiver.bio.ub.es/vegana/index.html
Hill, M.O. 1979. TWINSPAN: A FORTRAN program for arranging multivariate data in an
ordered two-way table by classification of the individuals and attributes. Cornell
University, Ithaca, NY, US.
Oksanen, J. 2006. Vegan: R functions for vegetation ecologists. Available at:
http://cc.oulu.fi/~jarioksa/softhelp/vegan.html
Patterson, B.D. & Atmar, W. 1986. Nested subsets and the structure of insular
mammalian faunas and archipelagos. Biol. J. Linn. Soc. 28: 65-82.
McCune, B. & Grace, J.B. (with Urban, D.L.) 2002. Analysis of ecological communities.
MjM Software Design, Gleneden Beach, OR, US.
Økland, R.H. 1996. Are ordination and constrained ordination alternative or complementary
strategies in general ecological studies? J. Veg. Sci. 7: 289-292.
PC-ORD Version 5 Testimonials
Marlin L. Bowles
Plant Conservation Biologist
The Morton Arboretum, USA
I have published four other papers that used PCORD-generated graphics, as well as
another that used TWINSPAN on PCORD to identify ecologically related groups. I have
also used the Bray/Curtis program on PCORD to generate similarity indices for several
papers. Needless to say, I can't say enough about how useful PCORD has been.
The new version should help even more.
Ethan Bright, Ph.D. Candidate
School of Natural Resources and Environment
The University of Michigan
Ann Arbor, Michigan, USA
I predict PC-ORD 5 will be a well-received improvement on the previous version.
Besides improving the program's statistical and graphical routines, the addition of an
"analytical wizard" and its ability to keep track (with a text file) of the
decision-making process make this an invaluable resource for both student and professional
alike.
PC-ORD Version 4 Review
Bulletin of the Ecological Society of America 81:127-128. (2000)
by Aaron M. Ellison
Department of Biological Sciences
Mount Holyoke College
South Hadley, MA
PC-ORD is a software package for multivariate analysis and
classification of ecological data. The DOS version (version 2) was reviewed in the January
1996 ESA Bulletin, and the first Windows (16-bit) version (version 3) was
reviewed in the April 1998 ESA Bulletin. In early 1999, MjM released the 32-bit
product (version 4), reviewed here, which is no longer compatible with Windows 3.x, and
like most new releases, demands more memory and disk space than earlier versions. If
you're no longer using Windows 3.x, upgrading to PC-ORD version 4 has significant
advantages over version 3.
Version 4 of PC-ORD requires an 80486 or better CPU, which means
it could run on the new computers in the Hubble Telescope, but it's unlikely you could run
Windows 95/98/NT efficiently on an 80486 CPU. The software occupies about 5.5 Mb of hard
disk space and uses a minimum of 8 Mb RAM. PC-ORD will use all available memory for matrix
operations, so the previous 16 Mb limit on matrix size has been removed. The only
remaining constraint to matrix size is that the default format for matrices, *.wk1 (Lotus
version 2.0), allows matrices no larger than 32,000 rows x 32,000 columns.
Available analysis routines fall into two broad groups: ordination and
classification. Of the routines in Table 1, blocked multiresponse
permutation procedure (MRBP) and weighted averaging are new to version 4. Nonmetric
multidimensional scaling (NMS) has been significantly enhanced to include an
"autopilot mode" that speeds through multiple runs and significance tests, and a
"predictive-mode" NMS that calculates scores for new data points based on prior
ordinations.
Plotting of species in ordination space, by using weighted averaging to
calculate their scores, is now available in NMS, Principal Components Analysis (PCA), and
Bray-Curtis ordinations. Distance measures available include Euclidean (raw, squared, and
relativized), Sorensen (raw and relativized), Jaccard, correlation, and chi-squared. In
addition, data summaries (mean, SD, sum, minimum, maximum, skewness, kurtosis, CV, species
richness (S), Shannon-Wiener diversity (H'), Shannon-Wiener evenness (H'/ln[S]),
and Simpson's index of diversity (D)) can be calculated for rows (sites) or
columns (species). Identification of outliers (matrix rows or columns) based on all
distance measures is accomplished by a separate routine. Basic species-area analysis for
determining adequacy of sampling is also included.
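The diversity summaries named above can be sketched in a few lines. The version below uses the textbook definitions: H' as the negative sum of p ln p, evenness as H'/ln(S), and Simpson's diversity as one minus the sum of squared proportions. That last form is an assumption, since several variants of Simpson's index are in use and the review does not say which one PC-ORD reports. The function name and data are hypothetical.

```python
from math import log

def diversity_summary(abundances):
    """Richness, Shannon-Wiener diversity and evenness, and Simpson's
    diversity for one sample unit (one row of a sites x species matrix)."""
    present = [a for a in abundances if a > 0]
    richness = len(present)
    if richness == 0:
        return {"S": 0, "H": 0.0, "E": 0.0, "D": 0.0}

    total = sum(present)
    proportions = [a / total for a in present]

    shannon = -sum(p * log(p) for p in proportions)
    # Evenness is undefined for a single species; report 0.0 in that case.
    evenness = shannon / log(richness) if richness > 1 else 0.0
    simpson = 1.0 - sum(p * p for p in proportions)
    return {"S": richness, "H": shannon, "E": evenness, "D": simpson}

# Hypothetical plot with four equally abundant species and one absent species.
summary = diversity_summary([10, 10, 10, 10, 0])
```

For four equally abundant species the sketch gives S = 4, H' = ln 4 (about 1.386), evenness of 1, and D = 0.75, the values expected for a perfectly even community.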
Beginning with version 3, PC-ORD produced publication-quality graphs
from most routines. These have been rounded out in version 4, which includes
publication-quality graphs for cluster analysis (dendrograms), species-area curves (with
confidence bands), and NMS scree plots. Graphics files are output as *.emf
(Windows enhanced metafiles) or *.bmp (bitmapped). Data management has also improved in
version 4: spreadsheets can be edited (albeit without full Windows capabilities), data
transformed or relativized, matrices transposed or multiplied, rows or columns deleted
(based on user-defined criteria, such as emptiness or sparseness), shuffled (randomized),
or smoothed. The list of acceptable formats for input data files remains short (*.wk1 spreadsheet,
PC-ORD compact format, PC-ORD version 1 format, DECORANA/TWINSPAN condensed format, list
format, and comma-separated values (CSV) format), but are easily created with ASCII text
editors or spreadsheet programs. Finally, like many new statistical packages, PC-ORD saves
work as a "project" (*.prj) file, which is really a set of associated
files (options, settings, matrices, results, graphics) produced by PC-ORD. This
facilitates organization of a set of analyses and increases efficiency, because options
and settings do not have to be re-entered at the start of each session. Individual files
can still be saved one at a time.
PC-ORD is still one of the most easily used, comprehensive packages for
multivariate analysis of ecological data. Many of the routines are unavailable in standard
statistical packages (which at best usually provide only PCA and cluster analysis). The
version 4 user's manual provides somewhat more information on the pitfalls of different
techniques and options than earlier manuals, but still assumes general familiarity with
the literature on multivariate methods. Routines in PC-ORD are current, and the authors
are quick to correct bugs and revise algorithms as new ideas are published. Incremental
updates and patches are available free from their web site <http://www.pcord.com>. The
package is reasonably priced and should be considered strongly for research and teaching
applications.
Literature cited
Beals, E. W. 1984. Bray-Curtis ordination: an effective strategy for
analysis of multivariate ecological data. Advances in Ecological Research 14:1-55.
Bray, J. R. and J. T. Curtis. 1957. An ordination of upland forest
communities of southern Wisconsin. Ecological Monographs 27:325-349.
Greig-Smith, P. 1983. Quantitative plant ecology. Third edition.
Blackwell Scientific, Oxford, UK.
Hill, M. O. 1979a. DECORANA--A FORTRAN program for detrended
correspondence analysis and reciprocal averaging. Section of Ecology and Systematics,
Cornell University, Ithaca, New York, USA.
Hill, M. O. 1979b. TWINSPAN--A FORTRAN program for arranging
multivariate data in an ordered two-way table by classification of the individuals and
attributes. Section of Ecology and Systematics, Cornell University, Ithaca, New York, USA.
Mantel, N. 1967. The detection of disease clustering and a generalized
regression approach. Cancer Research 27:209-220.
Mather, P. M. 1976. Computational methods of multivariate analysis in
physical geography. John Wiley and Sons, London, UK.
Mielke, P. W., Jr. 1984. Meteorological applications of permutation
techniques based on distance functions. Pages 813-830 in P. R. Krishnaiah and P.
K. Sen, editors. Handbook of statistics. Volume 4. Elsevier Science, The Hague, The
Netherlands.
Oksanen, J., and P. R. Minchin. 1997. Instability of ordination results
under changes in input data order: explanations and remedies. Journal of Vegetation
Science 8:447-454.
ter Braak, C. J. F. 1986. Canonical correspondence analysis: a new
eigenvector technique for multivariate direct gradient analysis. Ecology 67:1167-1179.
Whittaker, R. H. 1967. Gradient analysis of vegetation. Biological
Reviews 42:207-264.
Table 1. Analytical methods available in PC-ORD version 4 for multivariate ordination and classification

Type and method | Algorithm
Ordination |
Bray-Curtis | Bray and Curtis (1957), Beals (1984)
Canonical Correspondence Analysis (CCA) | ter Braak (1986) with corrections of Oksanen and Minchin (1997)
Detrended Correspondence Analysis (DCA) | Hill (1979a) with corrections of Oksanen and Minchin (1997)
Nonmetric Multidimensional Scaling (NMS) | Mather (1976)
Principal Components Analysis (PCA) | Greig-Smith (1983)
Reciprocal Averaging | Hill (1979a)
Weighted Averaging Classification | Whittaker (1967)
Classification |
Cluster Analysis |
Multiresponse Permutation Procedures (MRPP) | Mielke (1984)
Blocked MRPP (MRBP) | Mielke (1984)
Two-way Indicator Species Analysis (TWINSPAN) | Hill (1979b)
Indicator Species Analysis | Dufrêne and Legendre (1997)
Mantel test | Mantel (1967)