pcurve {pcurve}  R Documentation 
Fits a principal curve to a numeric multivariate dataset in arbitrary dimensions. Produces diagnostic plots.
pcurve(x, xcan = NULL, start = "ca", rank = FALSE, cv.fit = FALSE,
       penalty = 1, cv.all = FALSE, df = "vary", fit.meth = "spline",
       canfit = "lm", candf = FALSE, vary.adj = FALSE, subset,
       robust = FALSE, lowf = 0.5, min.df, max.df, max.df.cv.fit,
       ext.dist = TRUE, ext.dc = 0.9, metric = "bray", latent = FALSE,
       plot.pca = TRUE, thresh = 0.001, plot.true = TRUE,
       plot.init = FALSE, plot.segs = TRUE, plot.resp = TRUE,
       plot.cov = TRUE, maxit = 10, stretch = 2, fits = FALSE,
       prnt.fits = TRUE, trace = TRUE, trace.all = FALSE, pch = 1,
       row.chk0 = FALSE, col.chk0 = TRUE, use.loc = FALSE)
x 
numeric data matrix or data.frame. 
xcan 
data.frame or matrix of explanatory variables to be used in constrained PCs. 
start 
specifies how to determine the starting configuration (location of points on the initial curve): "ca" = correspondence analysis; "pca" = principal components analysis with Euclidean metric; "pca.bc" = principal components analysis with Bray-Curtis metric; "mds" = non-metric multidimensional scaling with Euclidean metric; "mds.bc" = non-metric multidimensional scaling with Bray-Curtis metric; "cs.bc" = classical scaling (metric multidimensional scaling) with Bray-Curtis metric; "ran" = random start. Alternatively, if start is numeric and of length dim(x)[1], it is used as a user-supplied configuration. 
rank 
if TRUE the starting configuration is transformed to ranks. 
cv.fit 
if TRUE a final iteration using cross-validation is done. 
penalty 
penalty for smoothing spline. A value of 1 corresponds to no penalty, with values > 1 giving a less-smoothed fit. Increasing the penalty for small data sets can reduce overfitting. If penalty = "np", penalty = 1 for N > 1000, penalty = 2 for N <= 100, and penalty = 4 - log(N, 10) for N > 100 and N <= 1000. 
cv.all 
if TRUE a cross-validated smoothing spline is fitted at each iteration. 
df 
if numeric specifies the df for the smoothing spline. 
fit.meth 
specifies the smoother. "spline" = smooth.spline, "poisson" = Poisson generalized additive model, "binomial" = binomial generalized additive model, "lowess" = lowess smoother (this argument is overridden by robust = TRUE). 
canfit 
"lm" or "gam", model used to relate pc to xcan. 
candf 
if canfit = "gam", df for model. May be a single value or
a vector of FALSE or positive integers indicating dfs for each
explanatory variable in xcan. If FALSE, this is equivalent to
fx=FALSE in 
vary.adj 
if FALSE the same df are used for the smooth of each variable, otherwise each variable has its own df. 
subset 
used to take a subset of x and start (if numeric). 
robust 
if TRUE uses lowess smooths, if FALSE uses smoothing spline. 
lowf 
specifies the span of the lowess smooth. 
min.df 
specifies the min df for the smoothing. 
max.df 
specifies the max df for smoothing during cross-validation. 
max.df.cv.fit 
specifies the max df for the smoothing. 
ext.dist 
if TRUE extended dissimilarities are used in the calculation of the
initial configuration, using the flexible shortest path. If FALSE
standard dissimilarities are used (see De'ath, 1999b and

ext.dc 
critical distance, the toolong argument in 
metric 
similarity metric, the method argument in 
latent 
if FALSE locations are rescaled after each iteration to give distance along the curve; if TRUE no rescaling is done. 
plot.pca 
if TRUE the fitting is plotted (assuming plot.true = TRUE) in the first 2 dimensions of PCA space. 
thresh 
threshold value of the difference in cross-validation for ceasing iteration. 
plot.true 
if TRUE the fitting process is plotted. 
plot.init 
if TRUE the initial fits to each variable are plotted. 
plot.segs 
if TRUE segments linking the fitted points on the curves to their corresponding data points are plotted. 
plot.resp 
if TRUE the final response curves are plotted. 
plot.cov 
if TRUE covariate partial effects are plotted (only if xcan is not null). 
maxit 
specifies the maximum number of iterations. 
stretch 
end segments of the curve are stretched by this factor at each iteration. 
fits 
if TRUE value of pcurve includes diagnostics for each variable. 
prnt.fits 
if TRUE statistics on model fits are printed. 
trace 
if TRUE prints out useful fitting diagnostics at each iteration. 
trace.all 
if TRUE prints out all curve details at each iteration. 
pch 
symbol for plots 
row.chk0 
if TRUE checks for and removes rows of x identically 0. 
col.chk0 
if TRUE checks for and removes columns of x identically 0. 
use.loc 
if TRUE pauses during the fitting displays (left mouse-click to progress to the next plot). 
See De'ath (1999a) for a full discussion of the functions and their application.
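The sample-size rule described for penalty = "np" can be written out as a small helper. This is only a sketch of the documented rule, not code taken from pcurve: the function name np_penalty is hypothetical, and the middle case is read as 4 - log(N, 10), which makes the rule continuous at N = 100 and N = 1000.

```r
# Hypothetical helper (not part of pcurve): the documented rule for
# penalty = "np" as a function of the number of observations N.
np_penalty <- function(N) {
  if (N > 1000) {
    1                 # large data sets: no penalty
  } else if (N <= 100) {
    2                 # small data sets: fixed penalty of 2
  } else {
    4 - log(N, 10)    # assumed reading of the middle case, 100 < N <= 1000
  }
}

# Penalty for a range of sample sizes
sapply(c(50, 100, 500, 1000, 5000), np_penalty)
```

Under this reading the penalty falls smoothly from 2 to 1 as N grows from 100 to 1000.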
An object of class principal curve containing a list comprising
s 
fitted values 
tag 
order of points along the curve 
lambda 
locations along the curve 
dist 
sum of squared distances of points from the curve 
c 
call to pcurve 
x 
data to which the curve was fitted 
df 
degrees of freedom for the smoothers used in the fit 
fit.list 
diagnostics for each variable, only included if fits = TRUE. 
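As a rough illustration of how these components might be used together, the sketch below builds a stand-in list with the documented component names and sorts the data rows by position along the curve. The numbers are invented, the object is not real output from pcurve, and the assumption that tag equals order(lambda) is not confirmed by this page.

```r
# Stand-in object with the documented component names; the values are
# made up for illustration and are not output from pcurve().
fit <- list(
  lambda = c(2.5, 0.0, 4.1, 1.2),
  x = matrix(c(5, 1, 9, 3,
               0, 4, 2, 6),
             ncol = 2,
             dimnames = list(paste0("site", 1:4), c("sp1", "sp2")))
)

# Assumption: tag is the ordering of the locations along the curve.
fit$tag <- order(fit$lambda)

# Data rows arranged from one end of the curve to the other
x_along_curve <- fit$x[fit$tag, ]
rownames(x_along_curve)
```

With a real fit, the same indexing would be applied to the returned object rather than to a hand-built list.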
R port by Chris Walsh cwalsh@unimelb.edu.au from S+ library by Glenn De'ath g.death@aims.gov.au. Original S code for principal curve analysis by Trevor Hastie hastie@stat.stanford.edu.
De'ath, G. 1999a Principal Curves: a new technique for indirect and direct gradient analysis. Ecology 80, 2237–2253.
De'ath, G. 1999b Extended dissimilarity: method of robust estimation of ecological distances with high beta diversity. Plant Ecology 144, 191–199.
Gittins, R. 1985 Canonical Analysis. A review with applications in ecology. Berlin: Springer-Verlag.
Hastie, T.J. and Tibshirani, R.J. 1990 Generalized additive models. London: Chapman and Hall.
Hastie, T.J. and Stuetzle, W. 1989 Principal Curves. Journal of the American Statistical Association 84, 502–516.
pcdiags.plt, vegdist, stepacross
# A simulated dataset with 4 response variables (taxa 1-4), n = 100.
# The response curve is Gaussian and noise is Poisson.
data(sim4var)
sim4fit <- pcurve(sim4var, plot.init = FALSE, use.loc = TRUE)

# Limestone grassland community example worked by De'ath (1999a),
# from data in Gittins (1985)
data(soilspec)
species <- sqrt(soilspec[, 2:9])
envvar <- soilspec[, 10:12]

# Indirect gradient analysis
spec.fit <- pcurve(species, start = "mds.bc", plot.init = FALSE,
                   use.loc = TRUE)

# Direct gradient analysis
soilspec.fit <- pcurve(species, xcan = envvar, start = "mds.bc",
                       plot.init = FALSE, fits = TRUE, prnt.fits = TRUE,
                       use.loc = TRUE)